NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by
such polynucleotides, along with uses for these polynucleotides and proteins, for example
in therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines
such as lymphokines, interferons, CSFs, chemokines, and interleukins) has matured
rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid
sequence of the protein in the case of hybridization cloning; activity of the protein in the
case of expression cloning). More recent "indirect" cloning techniques such as signal
sequence cloning, which isolates DNA sequences based on the presence of a now
well-recognized secretory leader sequence motif, as well as various PCR-based or low
stringency hybridization-based cloning techniques, have advanced the state of the art by
making available large numbers of DNA/amino acid sequences for proteins that are
known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of
PCR-based techniques, or by virtue of structural similarity to other genes of known
biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications
in, for example, diagnostics, forensics, gene mapping, identification of mutations
responsible for genetic disorders or other traits, assessment of biodiversity, and production
of many other types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules, cloned genes or degenerate variants thereof, especially naturally occurring
variants such as allelic variants, antisense polynucleotide molecules, and antibodies that
specifically recognize one or more epitopes present on such polypeptides, as well as
hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including
expression vectors, containing the polynucleotides of the invention, cells genetically
engineered to contain such polynucleotides and cells genetically engineered to express such
polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic
acid sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by hybridization (SBH), and in some cases, sequences obtained from one or
more public databases. The invention relates also to the proteins encoded by such
polynucleotides, along with therapeutic, diagnostic and research utilities for these
polynucleotides and proteins. These nucleic acid sequences are designated as SEQ ID NO:
1-438 and are provided in the Sequence Listing. In the nucleic acids provided in the
Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N is any of
the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid
sequences that hybridize to the complement of SEQ ID NO: 1-438 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; and nucleic acid sequences
that encode a peptide comprising a specific domain or truncation of the peptides encoded by
SEQ ID NO: 1-438. Also included is a polynucleotide comprising a nucleotide sequence
having at least 90% identity to an identifying sequence of SEQ ID NO: 1-438 or a degenerate
variant or fragment thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-438. The sequence
information can be a segment of any one of SEQ ID NO: 1-438 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-438.

A collection as used in this application can be a collection of only one
polynucleotide. The collection of sequence information or identifying information of each
sequence can be provided on a nucleic acid array. In one embodiment, segments of
sequence information are provided on a nucleic acid array to detect the polynucleotide that
contains the segment. The array can be designed to detect full-match or mismatch to the
polynucleotide that contains the segment. The collection can also be provided in a
computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic
acid sequences recited above; cloning or expression vectors containing the nucleic acid
sequences; and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences (or their reverse or direct complements) according to the invention have
numerous applications in a variety of techniques known to those skilled in the art of
molecular biology, such as use as hybridization probes, use as primers for PCR, use in an
array, use in computer-readable media, use in sequencing full-length genes, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-438 or
novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-438 or novel segments or parts of the nucleic acids
provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence
tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO:
1-438; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-438; and a polynucleotide comprising any of the nucleotide sequences of the mature
protein coding sequences of SEQ ID NO: 1-438. The polynucleotides of the present
invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide
sequences set forth in SEQ ID NO: 1-438; (b) a nucleotide sequence encoding any one of
the amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an
allelic variant of any polynucleotides recited above; (d) a polynucleotide which encodes a
species homolog (e.g., orthologs) of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any
of the polypeptides comprising an amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising any of the amino acid sequences set forth in the Sequence Listing;
or the corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides
having a nucleotide sequence set forth in SEQ ID NO: 1-438; or (b) polynucleotides that
hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically or immunologically active variants of any of the polypeptide
sequences in the Sequence Listing, and "substantial equivalents" thereof (e.g., with at least
about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity)
that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the
invention. Polypeptide compositions of the invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a 
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture
medium under conditions permitting expression of the desired polypeptide, and purifying
the polypeptide from the culture or from the host cells. Preferred embodiments include
those in which the protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a
variety of techniques known to those skilled in the art of molecular biology. These
techniques include use as hybridization probes, use as oligomers, or primers, for PCR,
use for chromosome and gene mapping, use in the recombinant production of protein,
and use in generation of anti-sense DNA or RNA, their chemical analogs and the like.
For example, when the expression of an mRNA is largely restricted to a particular cell or
tissue type, polynucleotides of the invention can be used as hybridization probes to detect
the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for 
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of
conventional procedures and methods that are currently applied to other proteins. For
example, a polypeptide of the invention can be used to generate an antibody that
specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies,
are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the
invention can also be used as molecular weight markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical 
condition which comprises the step of administering to a mammalian subject a 
therapeutically effective amount of a composition comprising a polypeptide of the 
present invention and a pharmaceutically acceptable carrier. 

In particular, the polypeptides and polynucleotides of the invention can be
utilized, for example, in methods for the prevention and/or treatment of disorders
involving aberrant protein expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for
example, be utilized as part of prognostic and diagnostic evaluation of disorders as
recited herein and for the identification of subjects exhibiting a predisposition to such
conditions. The invention provides a method for detecting the polynucleotides of the
invention in a sample, comprising contacting the sample with a compound that binds to
and forms a complex with the polynucleotide of interest for a period sufficient to form
the complex and under conditions sufficient to form a complex and detecting the complex
such that if a complex is detected, the polynucleotide of interest is detected. The
invention also provides a method for detecting the polypeptides of the invention in a
sample comprising contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex and detecting the formation of the complex such that if a complex is formed, the
polypeptide is detected.

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of
the invention. Furthermore, the invention provides methods for evaluating the efficacy of
drugs, and monitoring the progress of patients, involved in clinical trials for the treatment
of disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides
and/or polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of
the invention comprising contacting the compound with a polypeptide of the invention in
a cell for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives expression of a reporter gene sequence in the cell; and detecting the
complex by detecting the reporter gene sequence expression such that if expression of the
reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve
the administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and
other substances that modulate the overall activity of the target gene products.
Compounds and other substances can effect such modulation either on the level of target
gene/protein expression or target protein activity.

The polypeptides of the present invention and the polynucleotides encoding them
are also useful for the same functions known to one of skill in the art as the polypeptides
and polynucleotides to which they have homology (set forth in Table 2); for which they
have a signature region (as set forth in Table 3); or for which they have homology to a
gene family (as set forth in Table 4). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of
applications, as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 

4.1 DEFINITIONS

It must be noted that as used herein and in the appended claims, the singular 
forms "a", "an" and "the" include plural references unless the context clearly dictates 
otherwise. 

The term "active" refers to those forms of the polypeptide which retain the
biologic and/or immunologic activities of any naturally occurring polypeptide. According
to the invention, the terms "biologically active" or "biological activity" refer to a protein
or peptide having structural, regulatory or biochemical functions of a naturally occurring
molecule. Likewise "immunologically active" or "immunological activity" refers to the
capability of the natural, recombinant or synthetic polypeptide to induce a specific
immune response in appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only some of the nucleic acids bind or it may be
"complete" such that total complementarity exists between the single stranded molecules.
The degree of complementarity between the nucleic acid strands has significant effects on
the efficiency and strength of the hybridization between the nucleic acid strands.
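
The base-pairing rule in the definition above can be illustrated with a short sketch. The function names are hypothetical and chosen only for illustration; the sketch assumes DNA sequences written with the letters A, C, G, T and N, and treats N as pairing with N purely as a convenience.

```python
# Watson-Crick pairing for DNA. Treating N as pairing with N is an
# illustrative assumption, not something stated in the text.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}


def complement(seq: str) -> str:
    """Return the pairing strand, written in the opposite (3'->5') orientation."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the pairing strand rewritten in the conventional 5'->3' orientation."""
    return complement(seq)[::-1]


def fraction_paired(strand_5to3: str, strand_3to5: str) -> float:
    """Fraction of aligned positions that base-pair (1.0 means "complete")."""
    if len(strand_5to3) != len(strand_3to5):
        raise ValueError("strands must have the same length")
    paired = sum(
        PAIRS[a] == b for a, b in zip(strand_5to3.upper(), strand_3to5.upper())
    )
    return paired / len(strand_5to3)


if __name__ == "__main__":
    print(complement("AGT"))              # pairing strand for 5'-AGT-3', read 3'->5'
    print(fraction_paired("AGT", "TCA"))  # complete complementarity
    print(fraction_paired("AGT", "TCG"))  # partial complementarity
```

A value of 1.0 from `fraction_paired` corresponds to "complete" complementarity as defined above, and any lower value to "partial" complementarity.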

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term
"germ line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that
provide a steady and continuous source of germ cells for the production of gametes. The
term "primordial germ cells (PGCs)" refers to a small population of cells set aside from
other cell lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis that have the potential to differentiate into germ cells and other cells.
PGCs are the source from which GSCs and ES cells are derived. The PGCs, the GSCs
and the ES cells are capable of self-renewal. Thus these cells not only populate the germ
line and give rise to a plurality of terminally differentiated cells that comprise the adult
specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides 
which modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably
linked sequence" when the expression of the sequence is altered by the presence of the
EMF. EMFs include, but are not limited to, promoters, and promoter modulating
sequences (inducible elements). One class of EMFs are nucleic acid fragments which
induce the expression of an operably linked ORF in response to a specific regulatory
factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic
or synthetic origin which may be single-stranded or double-stranded and may represent
the sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or
RNA-like material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and N is A, C, G or T (U). It is contemplated that where the polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil).
Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid
which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion,"
or "segment" or "probe" or "primer" are used interchangeably and refer to a sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at least about
7 nucleotides, more preferably at least about 9 nucleotides, more preferably at least about
11 nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably less than about 100 nucleotides, more preferably less than about 50
nucleotides and most preferably less than 30 nucleotides. Preferably the probe is from
about 6 nucleotides to about 200 nucleotides, preferably from about 15 to about 50
nucleotides, more preferably from about 17 to 30 nucleotides and most preferably from
about 20 to 25 nucleotides. Preferably the fragments can be used in polymerase chain
reaction (PCR), various hybridization procedures or microarray procedures to identify or
amplify identical or related parts of mRNA or DNA molecules. A fragment or segment
may uniquely identify each polynucleotide sequence of the present invention. Preferably
the fragment comprises a sequence substantially similar to any one of SEQ ID NOs: 1-438.

Probes may, for example, be used to determine whether specific mRNA
molecules are present in a cell or tissue or to isolate similar nucleic acid sequences from
chromosomal DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods
Appl 1:241-250). They may be labeled by nick translation, Klenow fill-in reaction, PCR,
or other methods well known in the art. Probes of the present invention, their preparation
and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989,
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, both of
which are incorporated herein by reference in their entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NOs: 1-438. The sequence
information can be a segment of any one of SEQ ID NOs: 1-438 that uniquely identifies
or represents the sequence information of that sequence of SEQ ID NOs: 1-438. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are
three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers
exist, there are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in
arrays for expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is fully matched in the expressed sequences is also approximately one in five
because expressed sequences comprise less than approximately 5% of the entire genome
sequence.
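
The arithmetic behind these estimates can be reproduced with a short sketch. It takes the figures stated above (three billion base pairs per chromosome set, and an expressed fraction of about 5%) and makes the simplifying assumption, implicit in the analysis, that all four bases are equally likely at every position. The function and constant names are hypothetical.

```python
# Figures taken from the text; uniform, independent base composition is a
# simplifying assumption of this sketch.
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05   # expressed sequences as a share of the genome


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over the four bases."""
    return 4 ** k


def odds_against_full_match(k: int, target_bp: float = GENOME_BP) -> float:
    """Ratio of possible k-mers to positions in the target ("1 in N")."""
    return possible_kmers(k) / target_bp


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    cases = [
        ("twenty-mer, whole genome", 20, GENOME_BP),
        ("seventeen-mer, whole genome", 17, GENOME_BP),
        ("fifteen-mer, expressed sequences", 15, expressed_bp),
    ]
    for label, k, target in cases:
        print(f"{label}: about 1 in {odds_against_full_match(k, target):.1f}")
```

Under these assumptions the twenty-mer ratio comes out in the hundreds and the seventeen-mer and fifteen-mer ratios in the single digits, which is the same order as the rounded figures quoted above.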

Similarly, when using sequence information for detecting a single mismatch, a
segment can be a twenty-five mer. The probability that the twenty-five mer would appear in
a human genome with a single mismatch is calculated by multiplying the probability for a
full match (1/4^25) times the increased probability for mismatch at each nucleotide position
(3 x 25). The probability that an eighteen mer with a single mismatch can be detected in an
array for expression studies is approximately one in five. The probability that a twenty-mer
with a single mismatch can be detected in a human genome is approximately one in five.
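
The single-mismatch calculation described above, the full-match probability multiplied by three alternative bases at each of the k positions, can be sketched as follows. The genome size is the figure given earlier in the text; scaling the per-position probability by the number of target positions is an assumption of this sketch, as are the function names and the uniform base composition.

```python
# Genome size as given in the text; uniform base composition is assumed.
GENOME_BP = 3_000_000_000


def single_mismatch_probability(k: int) -> float:
    """(1 / 4**k) * (3 * k): full-match probability times the number of
    single-base variants (3 alternative bases at each of k positions)."""
    return (3 * k) / 4 ** k


def expected_single_mismatch_hits(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance single-mismatch occurrences in the target,
    obtained by scaling the per-position probability by the target size."""
    return single_mismatch_probability(k) * target_bp


if __name__ == "__main__":
    for k in (18, 20, 25):
        hits = expected_single_mismatch_hits(k)
        print(f"{k}-mer, whole genome: {hits:.3g} expected chance hits")
```

For a twenty-mer against the whole genome this gives a value between 0.1 and 0.2, the same order of magnitude as the approximate figure stated above; the sketch is not intended to reproduce the rounded values exactly.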

The term "open reading frame," ORF, means a series of nucleotide triplets coding
for amino acids without any termination codons and is a sequence translatable into
protein.
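
As an illustration of this definition, the following sketch checks whether a DNA sequence is a series of triplets free of termination codons, and finds the longest such run in a given frame. It assumes the standard genetic code, in which TAA, TAG and TGA are the termination codons; the function names are hypothetical.

```python
# Termination codons of the standard genetic code, in DNA letters.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a whole number of triplets with no termination codon."""
    seq = seq.upper()
    if not seq or len(seq) % 3 != 0:
        return False
    return all(seq[i:i + 3] not in STOP_CODONS for i in range(0, len(seq), 3))


def longest_open_run(seq: str, frame: int = 0) -> str:
    """Longest stretch of consecutive stop-free triplets in the given frame."""
    seq = seq.upper()
    best = current = ""
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = ""          # a termination codon closes the current run
        else:
            current += codon
            if len(current) > len(best):
                best = current
    return best


if __name__ == "__main__":
    print(is_open_reading_frame("ATGAAA"))         # two triplets, no stop
    print(is_open_reading_frame("ATGTAAAAA"))      # contains TAA in frame
    print(longest_open_run("ATGAAATAAGGGCCCTTT"))  # run after the TAA
```

The sketch follows the definition as worded and does not require a start codon; a practical ORF finder would typically also scan all reading frames on both strands.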

The terms "operably linked" or "operably associated" refer to functionally related
nucleic acid sequences. For example, a promoter is operably associated or operably
linked with a coding sequence if the promoter controls the transcription of the coding
sequence. While operably linked nucleic acid sequences can be contiguous and in the
same reading frame, certain genetic elements, e.g., repressor genes, are not contiguously
linked to the coding sequence but still control transcription/translation of the coding
sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a
number of differentiated cell types that are present in an adult organism. A pluripotent
cell is restricted in its differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to
naturally occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7 amino acids, more preferably at least about 9 amino acids and most
preferably at least about 17 or more amino acids. The peptide preferably is not greater
than about 200 amino acids, more preferably less than 150 amino acids and most
preferably less than 100 amino acids. Preferably the peptide is from about 5 to about 200
amino acids. To be active, any polypeptide must have sufficient length to display
biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by
cells that have not been genetically engineered and specifically contemplates various
polypeptides arising from post-translational modifications of the polypeptide including,
but not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation
and acylation.

The term "translated protein coding portion" means a sequence which encodes
the full length protein, which may include any leader sequence or any processing
sequence.

The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein portion"
means that portion of the protein which does not include a signal or leader sequence. The
peptide may have been produced by processing in the cell which removes any
leader/signal sequence. The mature protein portion may or may not include the initial
methionine residue. The methionine residue may be removed from the protein during
processing in the cell. The peptide may be produced synthetically or the protein may
have been produced using a polynucleotide encoding only the mature protein coding
sequence.

The term "derivative" refers to polypeptides chemically modified by such
techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer attachment such as pegylation (derivatization with polyethylene glycol)
and insertion or substitution by chemical synthesis of amino acids such as ornithine,
which do not normally occur in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created
using, e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues may be replaced, added or deleted without abolishing activities of interest, may
be found by comparing the sequence of the particular polypeptide with that of
homologous peptides and minimizing the number of amino acid sequence changes made
in regions of high homology (conserved regions) or by replacing amino acids with
consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides
may be synthesized or selected by making use of the "redundancy" in the genetic code.
Various codon substitutions, such as the silent changes which produce various restriction
sites, may be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide
sequence may be reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify the properties of any part of the polypeptide, to change
characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid
with another amino acid having similar structural and/or chemical properties, i.e.,
conservative amino acid replacements. "Conservative" amino acid substitutions may be
made on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues involved. For example,
nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine,
serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic)
amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino
acids include aspartic acid and glutamic acid. "Insertions" or "deletions" are preferably
in the range of about 1 to 20 amino acids, more preferably 1 to 10 amino acids. The
variation allowed may be experimentally determined by systematically making insertions,
deletions, or substitutions of amino acids in a polypeptide molecule using recombinant
DNA techniques and assaying the resulting recombinant variants for activity.
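
The four residue classes listed above can be encoded directly, which gives a simple way to flag whether a substitution stays within a class. This is a minimal sketch: treating class membership alone as the test for a "conservative" substitution is a simplification adopted here for illustration, the function names are hypothetical, and the two sequences compared are assumed to be already aligned and of equal length.

```python
# Residue classes as listed in the text, in one-letter amino acid codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def residue_class(aa: str) -> str:
    """Name of the class containing the given one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unrecognized residue: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same class (illustrative criterion)."""
    return residue_class(original) == residue_class(replacement)


def classify_substitutions(reference: str, variant: str):
    """List (position, original, replacement, conservative) for each
    differing position of two aligned, equal-length sequences (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, ref, var, is_conservative(ref, var))
        for pos, (ref, var) in enumerate(zip(reference.upper(), variant.upper()), 1)
        if ref != var
    ]


if __name__ == "__main__":
    print(is_conservative("L", "I"))   # both nonpolar
    print(is_conservative("D", "K"))   # acidic versus basic
    print(classify_substitutions("MKLD", "MRLE"))
```

Whether a given substitution actually preserves activity remains a matter for experiment, as the preceding paragraph notes.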

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such
alterations can, for example, alter one or more of the biological functions or biochemical
characteristics of the polypeptides of the invention. For example, such alterations may
change polypeptide characteristics such as ligand-binding affinities, interchain affinities,
or degradation/turnover rate. Further, such alterations can be selected so as to generate
polypeptides that are better suited for expression, scale up and the like in the host cells
chosen for expression. For example, cysteine residues can be deleted or substituted with
another amino acid residue in order to eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other
biological macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight, more preferably at least 99% by weight, of the indicated biological
macromolecules present (but water, buffers, and other small molecules, especially
molecules having a molecular weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated from at least one other component (e.g., nucleic acid or polypeptide) present
with the nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic
acid or polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or
other component normally present in a solution of the same. The terms "isolated" and
"purified" do not encompass nucleic acids or polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, 
means that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, 
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or 
proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product, 
"recombinant microbial" defines a polypeptide or protein essentially free of native 
endogenous substances and unaccompanied by associated native glycosylation. 
Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of 
glycosylation modifications; polypeptides or proteins expressed in yeast will have a 
glycosylation pattern in general different from those expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage 
or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An 
expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a 
genetic element or elements having a regulatory role in gene expression, for example, 
promoters or enhancers, (2) a structural or coding sequence which is transcribed into 
mRNA and translated into protein, and (3) appropriate transcription initiation and 
termination sequences. Structural units intended for use in yeast or eukaryotic expression 
systems preferably include a leader sequence enabling extracellular secretion of 
translated protein by a host cell. Alternatively, where recombinant protein is expressed 
without a leader or transport sequence, it may include an amino terminal methionine 
residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the 
recombinant transcriptional unit extrachromosomally. Recombinant expression systems 
as defined herein will express heterologous polypeptides or proteins upon induction of 
the regulatory elements linked to the DNA segment or synthetic gene to be expressed. 
This term also means host cells which have stably integrated a recombinant genetic 
element or elements having a regulatory role in gene expression, for example, promoters 
or enhancers. Recombinant expression systems as defined herein will express 
polypeptides or proteins endogenous to the cell upon induction of the regulatory elements 
linked to the endogenous DNA segment or gene to be expressed. The cells can be 
prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a 
membrane, including transport as a result of signal sequences in its amino acid sequence 
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation 
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell 
in which they are expressed. "Secreted" proteins also include without limitation proteins 
that are transported across the membrane of the endoplasmic reticulum. "Secreted" 
proteins are also intended to include proteins containing non-typical signal sequences 
(e.g., Interleukin-1 Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) 
and factors released from damaged cells (e.g., Interleukin-1 Receptor Antagonist, 
see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 16:27-55). 

Where desired, an expression vector may be designed to contain a "signal or 
leader sequence" which will direct the polypeptide through the membrane of a cell. Such 
a sequence may be naturally present on the polypeptides of the present invention or 
provided from heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood 
in the art as stringent. Stringent conditions can include highly stringent conditions (i.e., 
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 
1 mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately 
stringent conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary 
hybridization conditions are described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary 
stringent hybridization conditions include washing in 6X SSC/0.05% sodium 
pyrophosphate at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 
55°C (for 20-base oligonucleotides), and 60°C (for 23-base oligonucleotides). 
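
The oligonucleotide wash temperatures listed above can be captured as a simple lookup. The following is a minimal illustrative sketch in Python; the names OLIGO_WASH_TEMP_C and wash_temperature_c are hypothetical, and the sketch deliberately refuses lengths the text does not list rather than interpolating.

```python
# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, exactly as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo: str) -> int:
    """Return the listed wash temperature for an oligonucleotide.

    Only the four lengths named in the text are covered. Any other length
    raises a ValueError, because the text gives no rule for other lengths.
    """
    length = len(oligo)
    if length not in OLIGO_WASH_TEMP_C:
        raise ValueError(f"no listed wash temperature for a {length}-base oligonucleotide")
    return OLIGO_WASH_TEMP_C[length]
```

For example, a 17-base oligonucleotide maps to 48°C under this table, while an 18-base oligonucleotide is rejected.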

As used herein, "substantially equivalent" or "substantially similar" can refer both 
to nucleotide and amino acid sequences, for example a mutant sequence, that varies from 
a reference sequence by one or more substitutions, deletions, or additions, the net effect 
of which does not result in an adverse functional dissimilarity between the reference and 
subject sequences. Typically, such a substantially equivalent sequence varies from one of 
those listed herein by no more than about 35% (i.e., the number of individual residue 
substitutions, additions, and/or deletions in a substantially equivalent sequence, as 
compared to the corresponding reference sequence, divided by the total number of 
residues in the substantially equivalent sequence is about 0.35 or less). Such a sequence 
is said to have 65% sequence identity to the listed sequence. In one embodiment, a 
substantially equivalent, e.g., mutant, sequence of the invention varies from a listed 
sequence by no more than 30% (70% sequence identity); in a variation of this 
embodiment, by no more than 25% (75% sequence identity); in a further variation of 
this embodiment, by no more than 20% (80% sequence identity); in a further variation 
of this embodiment, by no more than 10% (90% sequence identity); and in a further 
variation of this embodiment, by no more than 5% (95% sequence identity). Substantially 
equivalent, e.g., mutant, amino acid sequences according to the invention preferably have 
at least 80% sequence identity with a listed amino acid sequence, more preferably at least 
85% sequence identity, more preferably at least 90% sequence identity, more preferably 
at least 95% sequence identity, more preferably at least 98% sequence identity, and most 
preferably at least 99% sequence identity. Substantially equivalent nucleotide sequences 
of the invention can have lower percent sequence identities, taking into account, for 
example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide 
sequence has at least about 65% identity, more preferably at least about 75% identity, 
more preferably at least about 80% sequence identity, more preferably at least 85% 
sequence identity, more preferably at least 90% sequence identity, more preferably at 
least about 95% sequence identity, more preferably at least 98% sequence identity, and 
most preferably at least 99% sequence identity. For the purposes of the present 
invention, sequences having substantially equivalent biological activity and substantially 
equivalent expression characteristics are considered substantially equivalent. For the 
purposes of determining equivalence, truncation of the mature sequence (e.g., via a 
mutation which creates a spurious stop codon) should be disregarded. Sequence identity 
may be determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods 
Enzymol. 183:626-645). Identity between sequences can also be determined by other 
methods known in the art, e.g., by varying hybridization conditions. 
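
The ratio in the definition above (residue changes divided by the number of residues in the substantially equivalent sequence) can be illustrated with a short calculation. The following Python sketch is an assumption-laden illustration, not the Jotun Hein method: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and the function names are hypothetical. It addresses only the numerical threshold, not the functional-equivalence criterion.

```python
def fraction_different(reference_aligned: str, subject_aligned: str) -> float:
    """Residue changes in the subject divided by the subject's residue count.

    Both inputs are assumed to be pre-aligned to equal length, with '-' as
    the gap character. A mismatched column counts as a substitution, a gap
    in the reference as an addition, and a gap in the subject as a deletion.
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have equal length")
    changes = sum(1 for r, s in zip(reference_aligned, subject_aligned) if r != s)
    subject_residues = sum(1 for s in subject_aligned if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_residues


def within_variation_limit(reference_aligned: str, subject_aligned: str,
                           max_fraction: float = 0.35) -> bool:
    """True if the subject varies from the reference by no more than max_fraction."""
    return fraction_different(reference_aligned, subject_aligned) <= max_fraction
```

With one substitution in a ten-residue subject sequence, the fraction is 0.1, which corresponds to 90% sequence identity in the terms used above.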

The term "totipotent" refers to the capability of a cell to differentiate into all of 
the cell types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so 
that the DNA is replicable, either as an extrachromosomal element, or by chromosomal 
integration. The term "transfection" refers to the taking up of an expression vector by a 
suitable host cell, whether or not any coding sequences are in fact expressed. The term 
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a 
virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of 
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can 
be readily identified using known UMFs as a target sequence or target motif with the 
computer-based systems described below. The presence and activity of a UMF can be 
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic 
acid molecule is then incubated with an appropriate host under appropriate conditions and 
the uptake of the marker sequence is determined. As described above, a UMF will 
increase the frequency of uptake of a linked marker sequence. 

Each of the above terms is meant to encompass all that is described for each, 
unless the context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising 
the nucleotide sequences of SEQ ID NO: 1 - 438; a polynucleotide encoding any one of 
the peptide sequences of SEQ ID NO: 1 - 438; and a polynucleotide comprising the 
nucleotide sequence encoding the mature protein coding sequence of the polynucleotides 
of any one of SEQ ID NO: 1 - 438. The polynucleotides of the present invention also 
include, but are not limited to, a polynucleotide that hybridizes under stringent conditions 
to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1 - 438; (b) 
nucleotide sequences encoding any one of the amino acid sequences set forth in the 
Sequence Listing; (c) a polynucleotide which is an allelic variant of any polynucleotide 
recited above; (d) a polynucleotide which encodes a species homolog of any of the 
proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of the polypeptides of SEQ ID NO: 1 - 438. Domains of 
interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or 
cytoplasmic domains, or combinations thereof; domains in immunoglobulin-like proteins 
include the variable immunoglobulin-like domains; domains in enzyme-like polypeptides 
include catalytic and substrate binding domains; and domains in ligand polypeptides 
include receptor-binding domains. 

The polynucleotides of the invention include naturally occurring or wholly or 
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The 
polynucleotides may include all of the coding region of the cDNA or may represent a 
portion of the coding region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences 
disclosed herein. The corresponding genes can be isolated in accordance with known 
methods using the sequence information disclosed herein. Such methods include the 
preparation of probes or primers from the disclosed sequence information for identification 
and/or amplification of genes in appropriate genomic libraries or other sources of genomic 
materials. Further 5' and 3' sequence can be obtained using methods known in the art. For 
example, full length cDNA or genomic DNA that corresponds to any of the polynucleotides 
of SEQ ID NO: 1 - 438 can be obtained by screening appropriate cDNA or genomic DNA 
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID 
NO: 1 - 438 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID 
NO: 1 - 438 may be used as the basis for suitable primer(s) that allow identification and/or 
amplification of genes in appropriate genomic DNA or cDNA libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and 
sequences (including cDNA and genomic sequences) obtained from one or more public 
databases, such as dbEST, gbpri, and UniGene. The EST sequences can provide identifying 
sequence information, representative fragment or segment information, or novel segment 
information for the full-length gene. 

The invention also provides polynucleotides including 
nucleotide sequences that are substantially equivalent to the polynucleotides recited 
above. Polynucleotides according to the invention can have, e.g., at least about 65%, at 
least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more 
typically at least about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 
91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99% 
sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are 
nucleic acid sequence fragments that hybridize under stringent conditions to any of the 
nucleotide sequences of SEQ ID NO: 1 - 438, or complements thereof, which fragment is 
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the 
polynucleotides of the invention are contemplated. Probes capable of specifically 
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the 
invention from other polynucleotide sequences in the same family of genes or can 
differentiate human genes from genes of other species, and are preferably based on 
unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to 
these specific sequences, but also include allelic and species variations thereof. Allelic and 
species variations can be routinely determined by comparing the sequence provided in SEQ 
ID NO: 1 - 438, a representative fragment thereof, or a nucleotide sequence at least 90% 
identical, preferably 95% identical, to SEQ ID NOs: 1 - 438 with a sequence from another 
isolate of the same species. Furthermore, to accommodate codon variability, the invention 
includes nucleic acid molecules coding for the same amino acid sequences as do the specific 
ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one 
codon for another codon that encodes the same amino acid is expressly contemplated. 
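
The codon substitution described above can be illustrated by checking whether two ORFs translate to the same amino acid sequence. The following Python sketch assumes the standard genetic code and DNA input in frame from the first base; the helper names translate and encode_same_protein are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF into one-letter amino acid codes."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True if both ORFs encode the identical amino acid sequence."""
    return translate(orf_a) == translate(orf_b)
```

For instance, ATGCTGTAA and ATGTTATAA differ at the second codon (CTG versus TTA), but both codons encode leucine, so the two ORFs encode the same amino acid sequence.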

The nearest neighbor or homology result for the nucleic acids of the present 
invention, including SEQ ID NOs: 1 - 438, can be obtained by searching a database using an 
algorithm or a program. Preferably, BLAST (Basic Local Alignment Search Tool) is used 
to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) 
and Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA 
version 3 search against Genpept, using the Fastxy algorithm, can be used. 

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are 
also provided by the present invention. Species homologs may be isolated and identified 
by making suitable probes or primers from the sequences provided herein and screening a 
suitable nucleic acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides 
or proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide 
which also encode proteins which are identical, homologous or related to that encoded by 
the polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences 
which encode variants of the described nucleic acids. These amino acid sequence 
variants may be prepared by methods known in the art by introducing appropriate 
nucleotide changes into a native or variant polynucleotide. There are two variables in the 
construction of amino acid sequence variants: the location of the mutation and the nature 
of the mutation. Nucleic acids encoding the amino acid sequence variants are preferably 
constructed by mutating the polynucleotide to encode an amino acid sequence that does 
not occur in nature. These nucleic acid alterations can be made at sites that differ in the 
nucleic acids from different species (variable positions) or in highly conserved regions 
(constant regions). Sites at such locations will typically be modified in series, e.g., by 
substituting first with conservative choices (e.g., hydrophobic amino acid to a different 
hydrophobic amino acid) and then with more distant choices (e.g., hydrophobic amino 
acid to a charged amino acid), and then deletions or insertions may be made at the target 
site. Amino acid sequence deletions generally range from about 1 to 30 residues, 
preferably about 1 to 10 residues, and are typically contiguous. Amino acid insertions 
include amino- and/or carboxyl-terminal fusions ranging in length from one to one 
hundred or more residues, as well as intrasequence insertions of single or multiple amino 
acid residues. Intrasequence insertions may range generally from about 1 to 10 amino 
acid residues, preferably from 1 to 5 residues. Examples of terminal insertions include the 
heterologous signal sequences necessary for secretion or for intracellular targeting in 
different host cells and sequences such as FLAG or poly-histidine sequences useful for 
purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences 
are changed via site-directed mutagenesis. This method uses oligonucleotide sequences 
to alter a polynucleotide to encode the desired amino acid variant, as well as sufficient 
adjacent nucleotides on both sides of the changed amino acid to form a stable duplex on 
either side of the site being changed. In general, the techniques of site-directed 
mutagenesis are well known to those of skill in the art and this technique is exemplified 
by publications such as Edelman et al., DNA 2:183 (1983). A versatile and efficient 
method for producing site-specific changes in a polynucleotide sequence was published 
by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to 
create amino acid sequence variants of the novel nucleic acids. When small amounts of 
template DNA are used as starting material, primer(s) that differ slightly in sequence 
from the corresponding region in the template DNA can generate the desired amino acid 
variant. PCR amplification results in a population of product DNA fragments that differ 
from the polynucleotide template encoding the polypeptide at the position specified by 
the primer. The product DNA fragments replace the corresponding region in the plasmid 
and this gives a polynucleotide encoding the desired amino acid variant. 
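
The oligonucleotide layout described above (a changed codon with adjacent template nucleotides on both sides) can be sketched as follows. This Python example is illustrative only: the function name mutagenic_primer is hypothetical, the template is assumed to be the coding strand in frame from its first base, and the default flank of 12 nucleotides is an arbitrary assumption, since the text requires only "sufficient" adjacent nucleotides to form a stable duplex.

```python
def mutagenic_primer(template: str, codon_index: int, new_codon: str,
                     flank: int = 12) -> str:
    """Build a sense-strand oligonucleotide carrying one codon change.

    template    -- coding strand, in frame from its first base
    codon_index -- 0-based index of the codon to replace
    new_codon   -- replacement codon (three bases)
    flank       -- template bases kept unchanged on each side (assumed value)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    start = codon_index * 3
    if start < flank or start + 3 + flank > len(template):
        raise ValueError("not enough flanking template sequence on both sides")
    left = template[start - flank:start]
    right = template[start + 3:start + 3 + flank]
    return left + new_codon + right
```

As a usage example, replacing the fourth codon (AAA) of ATGGCTAGCAAAGGTGAAGAACTG with GCG, using a flank of 6, yields GCTAGCGCGGGTGAA. Real primer design would also weigh melting temperature and secondary structure, which this sketch does not model.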

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis 
techniques well known in the art, such as, for example, the techniques in Sambrook et al., 
supra, and Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent 
degeneracy of the genetic code, other DNA sequences which encode substantially the 
same or a functionally equivalent amino acid sequence may be used in the practice of the 
invention for the cloning and expression of these novel nucleic acids. Such DNA 
sequences include those which are capable of hybridizing to the appropriate novel nucleic 
acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can 
be used to generate polynucleotides encoding chimeric or fusion proteins comprising one 
or more domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any 
of the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, 
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such 
polynucleotides are well known to those of skill in the art and can include, for example, 
methods for determining hybridization conditions that can routinely isolate 
polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the 
mature protein coding sequences corresponding to any one of SEQ ID NO: 1 - 438, or 
functional equivalents thereof, may be used to generate recombinant DNA molecules that 
direct the expression of that nucleic acid, or a functional equivalent thereof, in 
appropriate host cells. Also included are the cDNA inserts of any of the clones identified 
herein. 

A polynucleotide according to the invention can be joined to any of a variety of 
other nucleotide sequences by well-established recombinant DNA techniques (see 
Sambrook J et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory, NY). Useful nucleotide sequences for joining to polynucleotides include an 
assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and 
the like, that are well known in the art. Accordingly, the invention also provides a vector 
including a polynucleotide of the invention and a host cell containing the polynucleotide. 
In general, the vector contains an origin of replication functional in at least one organism, 
convenient restriction endonuclease sites, and a selectable marker for the host cell. 
Vectors according to the invention include expression vectors, replication vectors, probe 
generation vectors, and sequencing vectors. A host cell according to the invention can be 
a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a 
multicellular organism. 

The present invention further provides recombinant constructs comprising a 
nucleic acid having any of the nucleotide sequences of SEQ ID NOs: 1 - 438 or a 
fragment thereof or any other polynucleotides of the invention. In one embodiment, the 
recombinant constructs of the present invention comprise a vector, such as a plasmid or 
viral vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID 
NOs: 1 - 438 or a fragment thereof is inserted, in a forward or reverse orientation. In the 
case of a vector comprising one of the ORFs of the present invention, the vector may 
further comprise regulatory sequences, including for example, a promoter, operably 
linked to the ORF. Large numbers of suitable vectors and promoters are known to those 
of skill in the art and are commercially available for generating the recombinant 
constructs of the present invention. The following vectors are provided by way of 
example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, 
pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, 
pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXT1, pSG (Stratagene); 
pSVK3, pBPV, pMSG, pSVL (Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an 
expression control sequence such as the pMT2 or pED expression vectors disclosed in 
Kaufman et al., Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein 
recombinantly. Many suitable expression control sequences are known in the art. 
General methods of expressing recombinant proteins are also known and are exemplified 
in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein 
"operably linked" means that the isolated polynucleotide of the invention and an 
expression control sequence are situated within a vector or cell in such a way that the 
protein is expressed by a host cell which has been transformed (transfected) with the 
ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters 
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV 
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and 
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within 
the level of ordinary skill in the art. Generally, recombinant expression vectors will 
include origins of replication and selectable markers permitting transformation of the host 
cell, e.g., the ampicillin resistance gene of E. coli and the S. cerevisiae TRP1 gene, and a 
promoter derived from a highly-expressed gene to direct transcription of a downstream 
structural sequence. Such promoters can be derived from operons encoding glycolytic 
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat 
shock proteins, among others. The heterologous structural sequence is assembled in 
appropriate phase with translation initiation and termination sequences, and preferably, a 
leader sequence capable of directing secretion of translated protein into the periplasmic 
space or extracellular medium. Optionally, the heterologous sequence can encode a 
fusion protein including an amino terminal identification peptide imparting desired 
characteristics, e.g., stabilization or simplified purification of expressed recombinant 
product. Useful expression vectors for bacterial use are constructed by inserting a 
structural DNA sequence encoding a desired protein together with suitable translation 
initiation and termination signals in operable reading phase with a functional promoter. 
The vector will comprise one or more phenotypic selectable markers and an origin of 
replication to ensure maintenance of the vector and to, if desirable, provide amplification 
within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus 
subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, 
Streptomyces, and Staphylococcus, although others may also be employed as a matter of 
choice. 

As a representative but non-limiting example, useful expression vectors for 
bacterial use can comprise a selectable marker and bacterial origin of replication derived 
from commercially available plasmids comprising genetic elements of the well known 
cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, 
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega 
Biotech, Madison, WI, USA). These pBR322 "backbone" sections are combined with an 
appropriate promoter and the structural sequence to be expressed. Following 
transformation of a suitable host strain and growth of the host strain to an appropriate cell 
density, the selected promoter is induced or derepressed by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 
Cells are typically harvested by centrifugation, disrupted by physical or chemical means, 
and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. 
For example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated 
herein by reference, nucleic acid sequences encoding a polypeptide may be used to 
generate antibodies against the encoded polypeptide following topical administration of 
naked plasmid DNA or following injection, and preferably intra-muscular injection of the 
DNA. The nucleic acid sequences are preferably inserted in a recombinant expression 
vector and may be in the form of naked DNA. 

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid 
molecules that are hybridizable to or complementary to the nucleic acid molecule 
comprising the nucleotide sequence of SEQ ID NO: 1 - 438, or fragments, analogs or 
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is 
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the 
coding strand of a double-stranded cDNA molecule or complementary to an mRNA 
sequence. In specific aspects, antisense nucleic acid molecules are provided that 
comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500 
nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid 
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of 
SEQ ID NO: 1 - 438 or antisense nucleic acids complementary to a nucleic acid sequence 
of SEQ ID NO: 1 - 438 are additionally provided. 

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding 
region" of the coding strand of a nucleotide sequence of the invention. The term "coding 
region" refers to the region of the nucleotide sequence comprising codons which are 
translated into amino acid residues. In another embodiment, the antisense nucleic acid 
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide 
sequence of the invention. The term "noncoding region" refers to 5' and 3' sequences that 
flank the coding region that are not translated into amino acids (i.e., also referred to as 5' 
and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., 
SEQ ID NO: 1 - 438), antisense nucleic acids of the invention can be designed according 
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid 
molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or 
noncoding region of an mRNA. For example, the antisense oligonucleotide can be 
complementary to the region surrounding the translation start site of an mRNA. An 
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 
50 nucleotides in length. An antisense nucleic acid of the invention can be constructed 
using chemical synthesis or enzymatic ligation reactions using procedures known in the 
art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be 
chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase 
the physical stability of the duplex formed between the antisense and sense nucleic acids, 
e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
Examples of modified nucleotides that can be used to generate the antisense
nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically
using an expression vector into which a nucleic acid has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense
orientation to a target nucleic acid of interest, described further in the following
subsection).
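
By way of illustration only, the Watson-Crick design rule described above (an oligonucleotide complementary to the region surrounding the translation start site) may be sketched as follows. The function names, the default length of 20 nucleotides, and the toy mRNA used in the usage note are assumptions made for this sketch and are not taken from the sequence listing; Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Illustrative sketch only: helper names and default length are hypothetical.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_around_start(mrna: str, length: int = 20) -> str:
    """Return an antisense oligonucleotide spanning the first AUG of the mRNA.

    The target window is roughly centred on the start codon and is shifted
    inward when it would otherwise run past either end of the mRNA.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start < 0:
        raise ValueError("no AUG start codon found")
    centre = start + 1                      # middle base of the start codon
    left = max(0, centre - length // 2)
    right = min(len(mrna), left + length)
    left = max(0, right - length)           # re-anchor if clipped at the 3' end
    return reverse_complement_rna(mrna[left:right])
```

For example, `antisense_around_start("GGCACCAUGGCUAGC", 9)` targets the window `ACCAUGGCU` and returns its reverse complement, which contains `CAU` opposite the start codon. A practical design would additionally screen candidates for secondary structure and off-target complementarity, which this sketch does not attempt.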

The antisense nucleic acid molecules of the invention are typically administered
to a subject or generated in situ such that they hybridize with or bind to cellular mRNA
and/or genomic DNA encoding a protein according to the invention to thereby inhibit
expression of the protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by conventional nucleotide complementarity to form a stable duplex,
or, for example, in the case of an antisense nucleic acid molecule that binds to DNA
duplexes, through specific interactions in the major groove of the double helix. An
example of a route of administration of antisense nucleic acid molecules of the invention
includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules
can be modified to target selected cells and then administered systemically. For example,
for systemic administration, antisense molecules can be modified such that they
specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by
linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered
to cells using the vectors described herein. To achieve sufficient intracellular
concentrations of antisense molecules, vector constructs in which the antisense nucleic
acid molecule is placed under the control of a strong pol II or pol III promoter are
preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is
an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have
a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having
specificity for a nucleic acid of the invention can be designed based upon the nucleotide
sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-438). For example, a
derivative of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the active site is complementary to the nucleotide sequence to be cleaved in a
SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al.
U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic
RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g.,
Bartel et al., (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally,
Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci. 660:27-36; and Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see
Hyrup et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the
deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the
four natural nucleobases are retained. The neutral backbone of PNAs has been shown to
allow for specific hybridization to DNA and RNA under conditions of low ionic strength.
The synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996)
PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific
modulation of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of
single base pair mutations in a gene by, e.g., PNA directed PCR clamping; as artificial
restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases
(Hyrup B. (1996) above); or as probes or primers for DNA sequence and hybridization
(Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of
drug delivery known in the art. For example, PNA-DNA chimeras can be generated that
may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms
of base stacking, number of bonds between the nucleobases, and orientation (Hyrup
(1996) above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain can be synthesized on a solid support using standard phosphoramidite
coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the
PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA
monomers are then coupled in a stepwise manner to produce a chimeric molecule with a
5' PNA segment and a 3' DNA segment (Finn et al. (1996) above). Alternatively,
chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA segment.
See, Petersen et al. (1975) Bioorg Med Chem Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups
such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating
transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci.
U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered
cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents (see, e.g., Zon, 1988, Pharm. Res. 5:539-549). To this end, the oligonucleotide
may be conjugated to another molecule, e.g., a peptide, a hybridization-triggered
cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, etc.



4.5 HOSTS

The present invention further provides host cells genetically engineered to contain
the polynucleotides of the invention. For example, such host cells may contain nucleic
acids of the invention introduced into the host cell using known transformation,
transfection or infection methods. The present invention still further provides host cells
genetically engineered to express the polynucleotides of the invention, wherein such
polynucleotides are in operative association with a regulatory sequence heterologous to
the host cell which drives expression of the polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit,
or increase, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the polypeptide at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the encoding
sequences. See, for example, PCT International Publication No. WO94/12650, PCT
International Publication No. WO92/20808, and PCT International Publication No.
WO91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase)
and/or intron DNA may be inserted along with the heterologous promoter DNA. If
linked to the coding sequence, amplification of the marker DNA by standard selection
methods results in co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a
lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell,
such as a bacterial cell. Introduction of the recombinant construct into the host cell can
be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one of the polynucleotides of the invention can be used in conventional
manners to produce the gene product encoded by the isolated fragment (in the case of an
ORF) or can be used to produce a heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa
cells, CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E.
coli and B. subtilis. The most preferred cells are those which do not normally express the
particular polypeptide or protein or which express the polypeptide or protein at a low
natural level. Mature proteins can be expressed in mammalian cells, yeast, bacteria, or
other cells under the control of appropriate promoters. Cell-free translation systems can
also be employed to produce such proteins using RNAs derived from the DNA constructs
of the present invention. Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by Sambrook, et al., in Molecular
Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New York (1989),
the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7
lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines capable of expressing a compatible vector are, for example, the C127, monkey
COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human
epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed
primate cell lines, normal diploid cells, cell strains derived from in vitro culture of
primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or
Jurkat cells. Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and also any necessary ribosome binding sites, polyadenylation site,
splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be
used to provide the required nontranscribed genetic elements. Recombinant polypeptides
and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion
chromatography steps. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps. Microbial cells employed in
expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such
as yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains
include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains,
Candida, or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella
typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the
protein is made in yeast or bacteria, it may be necessary to modify the protein produced
therein, for example by phosphorylation or glycosylation of the appropriate sites, in order
to obtain the functional protein. Such covalent attachments may be accomplished using
known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be
engineered to express an endogenous gene comprising the polynucleotides of the
invention under the control of inducible regulatory elements, in which case the regulatory
sequences of the endogenous gene may be replaced by homologous recombination. As
described herein, gene targeting can be used to replace a gene's existing regulatory region
with a regulatory sequence isolated from a different gene or a novel regulatory sequence
synthesized by genetic engineering methods. Such regulatory sequences may be
comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory
elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the
RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements,
splice sites, leader sequences for enhancing or modifying transport or secretion properties
of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing
the gene under the control of the new regulatory sequence, e.g., inserting a new promoter
or enhancer or both upstream of a gene. Alternatively, the targeting event may be a
simple deletion of a regulatory element, such as the deletion of a tissue-specific negative
regulatory element. Alternatively, the targeting event may replace an existing element;
for example, a tissue-specific enhancer can be replaced by an enhancer that has broader
or different cell-type specificity than the naturally occurring elements. Here, the
naturally occurring sequences are deleted and new sequences are added. In all cases, the
identification of the targeting event may be facilitated by the use of one or more
selectable marker genes that are contiguous with the targeting DNA, allowing for the
selection of cells in which the exogenous DNA has integrated into the host cell genome.
The identification of the targeting event may also be facilitated by the use of one or more
marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct
homologous recombination event with sequences in the host cell genome does not result
in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International
Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by reference herein in its entirety.



4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1-438
or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NOs: 1-438 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity
that are encoded by: (a) a polynucleotide having any one of the nucleotide sequences set
forth in SEQ ID NOs: 1-438 or (b) polynucleotides encoding any one of the amino acid
sequences set forth as SEQ ID NO: 1-438 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization
conditions. The invention also provides biologically active or immunologically active
variants of any of the amino acid sequences set forth as SEQ ID NO: 1-438 or the
corresponding full length or mature protein; and "substantial equivalents" thereof (e.g.,
with at least about 65%, at least about 70%, at least about 75%, at least about 80%, at
least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%,
typically at least about 95%, 96%, 97%, more typically at least about 98%, or most
typically at least about 99% amino acid identity) that retain biological activity.
Polypeptides encoded by allelic variants may have a similar, increased, or decreased
activity compared to polypeptides comprising SEQ ID NO: 1-438.
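
As a non-limiting illustration of how a percent amino acid identity figure of the kind recited above might be computed, the following sketch scores two sequences that have already been aligned, with `-` denoting a gap. The function name and the choice of denominator (all alignment columns in which at least one sequence has a residue) are assumptions of this sketch; alignment programs differ in how they treat gaps, so the value obtained from any particular program may differ.

```python
# Illustrative sketch only: the gap convention and function name are assumptions.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

For instance, `percent_identity("MKTAYIAKQR", "MKTA-IAKQK")` counts 8 identical positions in 10 columns and gives 80.0, which would satisfy the "at least about 80%" level but not the "at least about 85%" level.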

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the
protein may be in linear form or they may be cyclized using known methods, for
example, as described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and
in R. S. McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such
as immunoglobulins for many purposes, including increasing the valency of protein
binding sites.

The present invention also provides both full-length and mature forms (for
example, without a signal sequence or precursor sequence) of the disclosed proteins. The
protein coding sequence is identified in the sequence listing by translation of the
disclosed nucleotide sequences. The mature form of such protein may be obtained by
expression of a full-length polynucleotide in a suitable mammalian cell or other host cell.
The sequence of the mature form of the protein is also determinable from the amino acid
sequence of the full-length form. Where proteins of the present invention are membrane
bound, soluble forms of the proteins are also provided. In such forms, part or all of the
regions causing the proteins to be membrane bound are deleted so that the proteins are
fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the
nucleic acid fragments of the present invention or by degenerate variants of the nucleic
acid fragments of the present invention. By "degenerate variant" is intended nucleotide
fragments which differ from a nucleic acid fragment of the present invention (e.g., an
ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an
identical polypeptide sequence. Preferred nucleic acid fragments of the present invention
are the ORFs that encode proteins.
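
The notion of a "degenerate variant" can be illustrated with a short sketch that translates two DNA fragments using the standard genetic code and checks whether they differ in nucleotide sequence yet encode an identical polypeptide. The function names and the example fragments in the usage note are hypothetical and are not drawn from the sequence listing; the sketch assumes the fragments are given in frame, on the coding strand, and that the standard code applies.

```python
# Illustrative sketch only: function names and example fragments are hypothetical.
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame coding-strand DNA fragment up to the first stop."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(fragment: str, candidate: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return (fragment.upper() != candidate.upper()
            and translate(fragment) == translate(candidate))
```

For example, `ATGGCTTCATAA` and `ATGGCCAGCTAA` both translate to Met-Ala-Ser, so the second is a degenerate variant of the first, whereas `ATGGATTCATAA` encodes Asp at the second position and is not.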

A variety of methodologies known in the art can be utilized to obtain any one of
the isolated polypeptides or proteins of the present invention. At the simplest level, the
amino acid sequence can be synthesized using commercially available peptide
synthesizers. The synthetically-constructed protein sequences, by virtue of sharing
primary, secondary or tertiary structural and/or conformational characteristics with
proteins may possess biological properties in common therewith, including protein
activity. This technique is particularly useful in producing small peptides and fragments
of larger polypeptides. Fragments are useful, for example, in generating antibodies
against the native polypeptide. Thus, they may be employed as biologically active or
immunological substitutes for natural, purified proteins in screening of therapeutic
compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be
purified from cells which have been altered to express the desired polypeptide or protein.
As used herein, a cell is said to be altered to express a desired polypeptide or protein
when the cell, through genetic manipulation, is made to produce a polypeptide or protein
which it normally does not produce or which the cell normally produces at a lower level.
One skilled in the art can readily adapt procedures for introducing and expressing either
recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to
generate a cell which produces one of the polypeptides or proteins of the present
invention.

The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium, and
purifying the protein from the cells or the culture in which the cells are grown. For
example, the methods of the invention include a process for producing a polypeptide in
which a host cell containing a suitable expression vector that includes a polynucleotide of
the invention is cultured under conditions that allow expression of the encoded
polypeptide. The polypeptide can be recovered from the culture, conveniently from the
culture medium, or from a lysate prepared from the host cells and further purified.
Preferred embodiments include those in which the protein produced by such process is a
full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial
cells which naturally produce the polypeptide or protein. One skilled in the art can
readily follow known methods for isolating polypeptides and proteins in order to obtain
one of the isolated polypeptides or proteins of the present invention. These include, but
are not limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes,
Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al.,
in Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular Biology. Polypeptide fragments that retain biological/immunological activity
include fragments comprising greater than about 100 amino acids, or greater than about
200 amino acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well
known in the art to identify molecules which bind to the polypeptides. These molecules
include, but are not limited to, e.g., small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are
then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds
that are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or
other cell by the specificity of the binding molecule for SEQ ID NO: 1-438.

The protein of the invention may also be expressed as a product of transgenic
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which
are characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein.

The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be made by those skilled in the art using known techniques. Modifications
of interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For
example, one or more of the cysteine residues may be deleted or replaced with another
amino acid to alter the conformation of the molecule. Techniques for such alteration,
substitution, replacement, insertion or deletion are well known to those skilled in the art
(see, e.g., U.S. Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or deletion retains the desired activity of the protein. Regions of the protein that




WO 02/081731 



PCT/US02/01222 



are important for the protein function can be determined by various methods known in
the art including the alanine-scanning method which involves systematic substitution of
single or strings of amino acids with alanine, followed by testing the resulting
alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein
that are important for protein function may be determined by the eMATRIX program.
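
The bookkeeping side of the alanine-scanning method described above (systematic substitution of single residues or strings of residues with alanine) may be sketched as follows. The function name, the `window` parameter, and the decision to skip windows that are already all alanine are assumptions of this sketch; the subsequent testing of each variant for biological activity is an experimental step that the code does not model.

```python
# Illustrative sketch only: names and the skip rule are hypothetical choices.
def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str]]:
    """Return (1-based start position, variant sequence) for each substitution.

    Each variant replaces `window` consecutive residues with alanine. Windows
    that already consist solely of alanine are skipped because the variant
    would be identical to the parent sequence.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    protein = protein.upper()
    variants = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:i] + "A" * window + protein[i + window:]
        variants.append((i + 1, variant))
    return variants
```

For the toy peptide `MKAL`, a single-residue scan yields variants at positions 1, 2 and 4 (`AKAL`, `MAAL`, `MKAA`), with position 3 skipped because it is already alanine. Positions whose variants lose activity in the assay would be the candidates for functionally important residues.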

Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and are useful for screening or other
immunological methodologies may also be easily made by those skilled in the art given
the disclosures herein. Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide
of the invention to suitable control sequences in one or more insect expression vectors,
and employing an insect expression system. Materials and methods for
baculovirus/insect cell expression systems are commercially available in kit form from,
e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac™ kit), and such methods are well
known in the art, as described in Summers and Smith, Texas Agricultural Experiment
Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an
insect cell capable of expressing a polynucleotide of the present invention is
"transformed."

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or
cell extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such
affinity resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA
Sepharose™; one or more steps involving hydrophobic interaction chromatography using
such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity
chromatography.

Alternatively, the protein of the invention may also be expressed in a form which
will facilitate purification. For example, it may be expressed as a fusion protein, such as
those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as a His tag. Kits for expression and purification of such fusion proteins are
commercially available from New England BioLabs (Beverly, Mass.), Pharmacia
(Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an
epitope and subsequently purified by using a specific antibody directed to such epitope.
One such epitope ("FLAG®") is commercially available from Kodak (New Haven,
Conn.).

Finally, one or more reverse-phase high performance liquid chromatography
(RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or
all of the foregoing purification steps, in various combinations, can also be employed to
provide a substantially homogeneous isolated recombinant protein. The protein thus
purified is substantially free of other mammalian proteins and is defined in accordance
with the present invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted,
inserted, or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide or analog is fused to another moiety or moieties, e.g., a targeting moiety or
another therapeutic agent. Such analogs may exhibit improved properties such as activity
and/or stability. Examples of moieties which may be fused to the polypeptide or an
analog include, for example, targeting moieties which provide for the delivery of
polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, antibodies to immune
cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well as receptors
and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for
example, immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3
antibodies and steroids. Also, polypeptides may be fused to immune modulators, and
other cytokines such as alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the
largest match between the sequences tested. Methods to determine identity and similarity
are codified in computer programs including, but not limited to, the GCG program package,
including GAP (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group, University of Wisconsin, Madison, WI), BLASTP, BLASTN,
BLASTX, FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST
(Altschul, S.F. et al., Nucleic Acids Res. vol. 25, pp. 3389-3402, herein incorporated by
reference), eMatrix software (Wu et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999),
herein incorporated by reference), eMotif software (Nevill-Manning et al., ISMB-97, Vol.
4, pp. 202-209, herein incorporated by reference), pFam software (Sonnhammer et al.,
Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by reference)
and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 105-31
(1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources
(BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
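
As a minimal sketch of how a percent identity can be derived from an alignment that gives the largest match between two sequences, the following Python example performs a global (Needleman-Wunsch) alignment and counts matching aligned positions. The scoring values (match, mismatch, gap), the function names, and the use of the alignment length as the denominator are illustrative assumptions; they are not the parameters of GAP, BLAST, or any other program named above, which use substitution matrices and more elaborate gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    The scores are assumed, illustrative values.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    gapped_a, gapped_b = global_align(a, b)
    if not gapped_a:
        return 0.0
    same = sum(1 for x, y in zip(gapped_a, gapped_b) if x == y and x != "-")
    return 100.0 * same / len(gapped_a)
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so a reported identity is only meaningful together with the program and parameters used to compute it.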

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to another polypeptide. Within a fusion protein the polypeptide
according to the invention can correspond to all or a portion of a protein according to the
invention. In one embodiment, a fusion protein comprises at least one biologically active
portion of a protein according to the invention. In another embodiment, a fusion protein
comprises at least two biologically active portions of a protein according to the invention.
Within the fusion protein, the term "operatively linked" is intended to indicate that the
polypeptide according to the invention and the other polypeptide are fused in-frame to
each other. The polypeptide can be fused to the N-terminus, the C-terminus, or the
middle of the other polypeptide.

For example, in one embodiment a fusion protein comprises a polypeptide
according to the invention operably linked to the extracellular domain of a second
protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the 
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more
domains fused to sequences derived from a member of the immunoglobulin protein
family. The immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical compositions and administered to a subject to inhibit an interaction
between a ligand and a protein of the invention on the surface of a cell, to thereby
suppress signal transduction in vivo. The immunoglobulin fusion proteins can be used to
affect the bioavailability of a cognate ligand. Inhibition of the ligand/protein interaction
may be useful therapeutically both for the treatment of proliferative and differentiative
disorders, e.g., cancer, and for modulating (e.g., promoting or inhibiting) cell survival.
Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a
ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation,
restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends
as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic ligation. In another embodiment, the fusion gene can be synthesized by
conventional techniques including automated DNA synthesizers. Alternatively, PCR
amplification of gene fragments can be carried out using anchor primers that give rise to
complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John
Wiley & Sons, 1992). Moreover, many expression vectors are commercially available
that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a
polypeptide of the invention can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the protein of the invention.
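
The in-frame requirement for a fusion gene can be illustrated with a short Python sketch that joins two coding sequences and confirms that the reading frame is preserved across the junction. The function names and the toy sequences are hypothetical and are not sequences of the invention; the sketch assumes the standard genetic code, removes a terminal stop codon from the N-terminal partner, and ignores linkers, restriction sites and vector sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_part, c_part):
    """Join two open reading frames so that the second stays in frame.

    Raises ValueError if either part is not a whole number of codons or if
    the fused sequence contains a stop codon before its last codon.
    """
    n_part, c_part = n_part.upper(), c_part.upper()
    if len(n_part) % 3 or len(c_part) % 3:
        raise ValueError("each part must be a whole number of codons")
    if translate(n_part).endswith("*"):
        n_part = n_part[:-3]  # drop the stop codon of the N-terminal partner
    fused = n_part + c_part
    protein = translate(fused)
    if "*" in protein[:-1]:
        raise ValueError("internal stop codon in fused reading frame")
    return fused, protein
```

For example, `fuse_in_frame("ATGTCCCCTTAA", "ATGGCTTGA")` removes the first stop codon and returns a fused sequence that translates as a single polypeptide ending at the second stop codon.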

4.8 GENE THERAPY 

Mutations in a gene comprising the polynucleotides of the invention may result in loss of
normal function of the encoded protein. The invention thus provides gene therapy to
restore normal activity of the polypeptides of the invention, or to treat disease states
involving polypeptides of the invention. Delivery of a functional gene encoding
polypeptides of the invention to appropriate cells is effected ex vivo, in situ, or in vivo by
use of vectors, and more particularly viral vectors (e.g., adenovirus, adeno-associated
virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g.,
liposomes or chemical treatments). See, for example, Anderson, Nature, supplement to
vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of gene therapy technology
see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific American: 68-84
(1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of the
polynucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient
expression) or artificial chromosomes (stable expression). Cells may also be cultured ex
vivo in the presence of proteins of the present invention in order to proliferate or to
produce a desired effect on or activity in such cells. Treated cells can then be introduced
in vivo for therapeutic purposes. Alternatively, it is contemplated that in other human
disease states, preventing the expression of or inhibiting the activity of polypeptides of
the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of
antisense molecules to the nucleic acids of the present invention, their complements, or their
transcribed RNA sequences, by methods known in the art. Further, the polypeptides of the
present invention can be inhibited by using targeted deletion methods, or the insertion of a
negative regulatory element such as a silencer, which is tissue specific.
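
An antisense molecule is complementary to its target and runs in the opposite orientation. The following Python sketch shows one way to derive the sequence of an antisense DNA oligonucleotide from a target DNA or RNA sequence written 5' to 3'. The function name and the example sequence are hypothetical; the sketch covers only base complementarity and says nothing about target-site selection, oligonucleotide chemistry or delivery.

```python
# Map each base to its Watson-Crick partner; U in an RNA target pairs with A.
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(target):
    """Return the DNA reverse complement (5' to 3') of a DNA or RNA target."""
    return target.translate(COMPLEMENT)[::-1]
```

For instance, `antisense_dna("AUGGCC")` returns `"GGCCAT"`, which base-pairs with the six-nucleotide RNA target when aligned antiparallel to it.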

The present invention still further provides cells genetically engineered in vivo to 
express the polynucleotides of the invention, wherein such polynucleotides are in operative 
association with a regulatory sequence heterologous to the host cell which drives expression
of the polynucleotides in the cell. These methods can be used to increase or decrease the 
expression of the polynucleotides of the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression
by replacing, in whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter so that the cells express the protein at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the desired protein
encoding sequences. See, for example, PCT International Publication No. WO 94/12650,
PCT International Publication No. WO 92/20808, and PCT International Publication No.
WO 91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes
carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron
DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods
results in co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene may be replaced by homologous recombination. As described herein,
gene targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional
initiation sites, regulatory protein binding sites or combinations of said sequences.
Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different
cell-type specificity than the naturally occurring elements. Here, the naturally occurring
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.
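
The logic of combining a selectable marker with a flanking negatively selectable marker can be summarized as a small truth table. The Python sketch below is a deliberately simplified model under stated assumptions: the class and function names are hypothetical, each cell is described only by which markers have stably integrated, and the selection agents (which the text does not name) are treated as perfectly efficient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Assumed, idealized description of a cell after transfection."""
    positive_marker_integrated: bool  # selectable marker contiguous with targeting DNA
    negative_marker_integrated: bool  # flanking marker, e.g. a TK or gpt gene


def survives_selection(cell):
    """A cell survives if it carries the positive marker but not the negative one."""
    return cell.positive_marker_integrated and not cell.negative_marker_integrated


def classify(cell):
    """Interpret the marker pattern in terms of the likely integration event."""
    if not cell.positive_marker_integrated:
        return "no stable integration"
    if cell.negative_marker_integrated:
        return "integration retaining the flanking marker (not a correct targeting event)"
    return "candidate homologous recombination event"
```

Only the pattern in which the selectable marker is present and the flanking marker is absent passes both selections, which is why surviving cells are enriched for correct targeting events. In practice such candidates would still need to be confirmed by an independent method.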

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application
No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed
or inactivated in the germ line of animals using homologous recombination [Capecchi,
Science 244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the
regulatory control of exogenous or endogenous promoter elements, are known as
transgenic animals. Animals in which an endogenous gene has been inactivated by
homologous recombination are referred to as "knockout" animals. Knockout animals,
preferably non-human mammals, can be prepared as described in U.S. Patent No.
5,557,032, incorporated herein by reference. Transgenic animals are useful to determine
the roles polypeptides of the invention play in biological processes, and preferably in
disease states. Transgenic animals are useful as model systems to identify compounds
that modulate lipid metabolism. Transgenic animals, preferably non-human mammals,
are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased
protein expression. The homologous promoter can be supplemented by insertion of one
or more heterologous enhancer elements known to confer promoter activation in a
particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to
express polypeptides of the invention or that express a variant polypeptide. Such animals
are useful as models for studying the in vivo activities of the polypeptides as well as for
studying modulators of the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit
one or more of the uses or biological activities (including those associated with assays
cited herein) identified herein. Uses or activities described for proteins of the present
invention may be provided by administration or use of such proteins or of
polynucleotides encoding such proteins (such as, for example, in gene therapies or
vectors suitable for introduction of DNA). The mechanism underlying the particular
condition or pathology will dictate whether the polypeptides of the invention, the
polynucleotides of the invention or modulators (activators or inhibitors) thereof would be
beneficial to the subject in need of treatment. Thus, "therapeutic compositions of the
invention" include compositions comprising isolated polynucleotides (including
recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and
truncations or domains thereof), or compounds and other substances that modulate the
overall activity of the target gene products, either at the level of target gene/protein
expression or target protein activity. Such modulators include polypeptides, analogs
(variants), including fragments and fusion proteins, antibodies and other binding proteins;
chemical compounds that directly or indirectly activate or inhibit the polypeptides of the
invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of
the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the
research community for various purposes. The polynucleotides can be used to express
recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either
constitutively or at a particular stage of tissue differentiation or development or in disease
states); as molecular weight markers on gels; as chromosome markers or tags (when
labeled) to identify chromosomes or to map related gene positions; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes
to hybridize and thus discover novel, related DNA sequences; as a source of information
to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and
making oligomers for attachment to a "gene chip" or other support, including for
examination of expression patterns; to raise anti-protein antibodies using DNA
immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another
immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand
interaction), the polynucleotide can also be used in interaction trap assays (such as, for
example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify
inhibitors of the binding interaction.
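
As one illustration of using a polynucleotide sequence as a source of information to derive PCR primers, the Python sketch below takes a forward primer from the 5' end of a template and a reverse primer as the reverse complement of its 3' end, and estimates a melting temperature with the Wallace rule (2 °C per A or T, 4 °C per G or C). The function names, the default primer length and the example template are assumptions for illustration only; the Wallace rule is a rough estimate for short oligonucleotides, and practical primer design also considers specificity, secondary structure and primer-dimer formation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Approximate melting temperature in degrees C by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primer_pair(template, length=18):
    """Return (forward, reverse) primers flanking the whole template."""
    template = template.upper()
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse
```

With the toy template `"ATGGCCAAATTTGGGCAT"` and a primer length of 6, `primer_pair` yields the forward primer `"ATGGCC"` and the reverse primer `"ATGCCC"`.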

The polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for
high-throughput screening; to raise antibodies or to elicit another immune response; as a
reagent (including the labeled reagent) in assays designed to quantitatively determine
levels of the protein (or its receptor) in biological fluids; as markers for tissues in which
the corresponding polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in a disease state); and, of
course, to isolate correlative receptors or ligands. Proteins involved in these binding
interactions can also be used to screen for peptide or small molecule inhibitors or agonists
of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent 
grade or kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in
the art. References disclosing such methods include without limitation "Molecular
Cloning: A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press,
Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology:
Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R.
Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source
of carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be
added to the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the
case of microorganisms, the polypeptide or polynucleotide of the invention can be added to
the medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine,
cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell
populations. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Many protein factors discovered to date, including all known cytokines, have
exhibited activity in one or more factor-dependent cell proliferation assays, and hence the
assays serve as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions of the present invention is evidenced by any one of a number of routine
factor-dependent cell proliferation assays for cell lines including, without limitation, 32D,
DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123,
T1165, HT2, CTLL2, TF-1, Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions
of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular
Immunology 133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node
cells or thymocytes include, without limitation, those described in: Polyclonal T cell
stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology.
J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto.
1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic
cells include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938, 1983; Measurement of mouse and human interleukin 6, Nordan, R. In
Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley
and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986;
Measurement of human Interleukin 11, Bennett, R., Giannotti, J., Clark, S. C. and Turner,
K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1, John
Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others,
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter
6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans);
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al.,
Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai
et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor
activity and be involved in the proliferation, differentiation and survival of pluripotent
and totipotent stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells and/or germ line stem cells. Administration of the polypeptide
of the invention to stem cells in vivo or ex vivo is expected to maintain and expand cell
populations in a totipotential or pluripotential state which would be useful for
re-engineering damaged or diseased tissues, transplantation, manufacture of
bio-pharmaceuticals and the development of bio-sensors. The ability to produce large
quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors;
implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells,
gastrointestinal cells and others; and organs for transplantation such as kidney, liver,
pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or
cytokines may be administered in combination with the polypeptide of the invention to
achieve the desired effect, including any of the growth factors listed herein, other stem
cell maintenance factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF), Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble
IL-6 receptor fused to IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha),
G-CSF, GM-CSF, thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth
factor (PDGF), neural growth factors and basic fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type,
expansion of these cells in culture will facilitate the production of large quantities of
mature cells. Techniques for culturing stem cells are known in the art and administration
of polypeptides of the invention, optionally with other growth factors and/or cytokines, is
expected to enhance the survival and proliferation of the stem cell populations. This can
be accomplished by direct administration of the polypeptide of the invention to the
culture medium. Alternatively, stromal cells transfected with a polynucleotide that
encodes the polypeptide of the invention can be used as a feeder layer for the stem
cell populations in culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to
induce autocrine expression of the polypeptide of the invention. This will allow for
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as
is or that can then be differentiated into the desired mature cell types. These stable cell
lines can also serve as a source of undifferentiated totipotential/pluripotential mRNA to
create cDNA libraries and templates for polymerase chain reaction experiments. These
studies would allow for the isolation and identification of differentially expressed genes
in stem cell populations that regulate stem cell proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in
the treatment of many pathological conditions. For example, polypeptides of the present
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial
cells that can be used to augment or replace cells damaged by illness, autoimmune
disease, accidental damage or genetic disorders. The polypeptide of the invention may be
useful for inducing the proliferation of neural cells and for the regeneration of nerve and
brain tissue, i.e., for the treatment of central and peripheral nervous system diseases and
neuropathies, as well as mechanical and traumatic disorders which involve degeneration,
death or trauma to neural cells or nerve tissue. In addition, the expanded stem cell
populations can also be genetically altered for gene therapy purposes and to decrease host
rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also
be manipulated to achieve controlled differentiation of the stem cells into more
differentiated cell types. A broadly applicable method of obtaining pure populations of a
specific differentiated cell type from undifferentiated stem cell populations involves the
use of a cell-type specific promoter driving a selectable marker. The selectable marker
allows only cells of the desired type to survive. For example, stem cells can be induced
to differentiate into cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991);
Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L.
W. In: Principles of Tissue Engineering eds. Lanza et al., Academic Press (1997)).
Alternatively, directed differentiation of stem cells can be accomplished by culturing the
stem cells in the presence of a differentiation factor such as retinoic acid and an
antagonist of the polypeptide of the invention which would inhibit the effects of
endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one
of various cell sources (including hematopoietic stem cells and embryonic stem cells) and
cultured on a feeder layer, as described by Thomson et al., Proc. Natl. Acad. Sci. U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in
combination with other growth factors or cytokines. The ability of the polypeptide of the
invention to induce stem cell proliferation is determined by colony formation on
semi-solid support, e.g., as described by Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders.
Even marginal biological activity in support of colony forming cells or of
factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g., in
supporting the growth and proliferation of erythroid progenitor cells alone or in
combination with other cytokines, thereby indicating utility, for example, in treating
various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the
production of erythroid precursors and/or erythroid cells; in supporting the growth and
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to
prevent or treat consequent myelo-suppression; in supporting the growth and proliferation
of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in
place of or complementary to platelet transfusions; and/or in supporting the growth and
proliferation of hematopoietic stem cells which are capable of maturing to any and all of
the above-mentioned hematopoietic cells and therefore find therapeutic utility in various
stem cell disorders (such as those usually treated with transplantation, including, without
limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in
repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or
ex vivo (i.e., in conjunction with bone marrow transplantation or with peripheral
progenitor cell transplantation (homologous or heterologous)) as normal cells or
genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:






WO 02/081731 



PCT/US02/01222 



Suitable assays for proliferation and differentiation of various hematopoietic lines 
are cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without
limitation, those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller
et al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among
others, proteins that regulate lympho-hematopoiesis) include, without limitation, those
described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New
York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992;
Primitive hematopoietic colony forming cells with high proliferative potential, McNiece,
I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol
pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental
Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E.
In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal
cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term
culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage,
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing
and tissue repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone
growth in circumstances where bone is not normally formed has application in the
healing of bone fractures and cartilage damage or defects in humans and other animals.
Compositions of a polypeptide, antibody, binding partner, or other modulator of the
invention may have prophylactic use in closed as well as open fracture reduction and also
in the improved fixation of artificial joints. De novo bone formation induced by an
osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic
resection induced craniofacial defects, and also is useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors
of bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative
disorders, or periodontal disease, such as through stimulation of bone and/or cartilage
repair or by blocking inflammation or processes of tissue destruction (collagenase
activity, osteoclast activity, etc.) mediated by inflammatory processes may also be
possible using the composition of the invention.

Another category of tissue regeneration activity that may involve the polypeptide
of the present invention is tendon/ligament formation. Induction of tendon/ligament-like
tissue or other tissue formation in circumstances where such tissue is not normally
formed has application in the healing of tendon or ligament tears, deformities and other
tendon or ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue.
De novo tendon/ligament-like tissue formation induced by a composition of the present
invention contributes to the repair of congenital, trauma induced, or other tendon or
ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention
may provide an environment to attract tendon- or ligament-forming cells, stimulate growth
of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo
for return in vivo to effect tissue repair. The compositions of the invention may also be
useful in the treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament
defects. The compositions may also include an appropriate matrix and/or sequestering
agent as a carrier as is well known in the art.






The compositions of the present invention may also be useful for proliferation of
neural cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve
tissue. More specifically, a composition may be used in the treatment of diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and
localized neuropathies, and central nervous system diseases, such as Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager
syndrome. Further conditions which may be treated in accordance with the present
invention include mechanical and traumatic disorders, such as spinal cord disorders, head
trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting
from chemotherapy or other medical therapies may also be treatable using a composition
of the invention.

Compositions of the invention may also be useful to promote better or faster
closure of non-healing wounds, including without limitation pressure ulcers, ulcers
associated with vascular insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver,
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular
(including vascular endothelium) tissue, or for promoting the growth of cells comprising
such tissues. Part of the desired effect may be achieved by inhibition or modulation of
fibrotic scarring, which may allow normal tissue to regenerate. A polypeptide of the
present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues,
and conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or
inhibiting differentiation of tissues described above from precursor tissues or cells; or for
inhibiting the growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described
in: International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent
Publication No. WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in:
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J.
Invest. Dermatol. 71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for which assays
are described herein. A polynucleotide of the invention can encode a polypeptide
exhibiting such activities. A protein may be useful in the treatment of various immune
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g.,
in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as
affecting the cytolytic activity of NK cells and other cell populations. These immune
deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or
fungal infections, or may result from autoimmune disorders. More specifically, infectious
diseases caused by viral, bacterial, fungal or other infection may be treatable using a
protein of the present invention, including infections by HIV, hepatitis viruses, herpes
viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be
useful where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present
invention include, for example, connective tissue disease, multiple sclerosis, systemic
lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation,
Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus,
myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease.
Such a protein (or antagonists thereof, including antibodies) of the present invention may
also be useful in the treatment of allergic reactions and conditions (e.g., anaphylaxis,
serum sickness, drug reactions, food allergies, insect venom allergies, mastocytosis,
allergic rhinitis, hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic
dermatitis, allergic contact dermatitis, erythema multiforme, Stevens-Johnson syndrome,
allergic conjunctivitis, atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant
papillary conjunctivitis and contact allergies), such as asthma (particularly allergic
asthma) or other respiratory problems. Other conditions, in which immune suppression is
desired (including, for example, organ transplantation), may also be treatable using a
protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo
animal models such as the cumulative contact enhancement test (Lastbom et al.,
Toxicology 125: 59-66, 1998), skin prick test (Hoffmann et al., Allergy 54: 446-54,
1999), guinea pig skin sensitization test (Vohr et al., Arch. Toxicol. 73: 501-9), and
murine local lymph node assay (Kimber et al., J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or
blocking an immune response already in progress or may involve preventing the
induction of an immune response. The functions of activated T cells may be inhibited by
suppressing T cell responses or by inducing specific tolerance in T cells, or both.
Immunosuppression of T cell responses is generally an active, non-antigen-specific
process which requires continuous exposure of the T cells to the suppressive agent.
Tolerance, which involves inducing non-responsiveness or anergy in T cells, is
distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the
absence of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing
high level lymphokine synthesis by activated T cells, will be useful in situations of tissue,
skin and organ transplantation and in graft-versus-host disease (GVHD). For example,
blockage of T cell function should result in reduced tissue destruction in tissue
transplantation. Typically, in tissue transplants, rejection of the transplant is initiated
through its recognition as foreign by T cells, followed by an immune reaction that
destroys the transplant. The administration of a therapeutic composition of the invention
may prevent cytokine synthesis by immune cells, such as T cells, and thus acts as an
immunosuppressant. Moreover, a lack of costimulation may also be sufficient to anergize
the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B
lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a
subject, it may also be necessary to block the function of a combination of B lymphocyte
antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl.
Acad. Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul
ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used
to determine the effect of therapeutic compositions of the invention on the development
of that disease.

Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate
activation of T cells that are reactive against self tissue and which promote the production
of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the
activation of autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents which block stimulation of T cells can be used to inhibit T
cell activation and prevent production of autoantibodies or T cell-derived cytokines
which may be involved in the disease process. Additionally, blocking reagents may
induce antigen-specific tolerance of autoreactive T cells which could lead to long-term
relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal
models of human autoimmune diseases. Examples include murine experimental
autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB
hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and
BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental
Immunology, Raven Press, New York, 1989, pp. 840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a
means of upregulating immune responses, may also be useful in therapy. Upregulation of
immune responses may be in the form of enhancing an existing immune response or
eliciting an initial immune response. For example, enhancing an immune response may
be useful in cases of viral infection, including systemic viral diseases such as influenza,
the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient
by removing T cells from the patient, costimulating the T cells in vitro with viral
antigen-pulsed APCs either expressing a peptide of the present invention or together with
a stimulatory form of a soluble peptide of the present invention and reintroducing the in
vitro activated T cells into the patient. Another method of enhancing anti-viral immune
responses would be to isolate infected cells from a patient, transfect them with a nucleic
acid encoding a protein of the present invention as described herein such that the cells
express all or a portion of the protein on their surface, and reintroduce the transfected
cells into the patient. The infected cells would now be capable of delivering a
costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation
signal to T cells to induce a T cell mediated immune response against the transfected
tumor cells. In addition, tumor cells which lack MHC class I or MHC class II molecules,
or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules,
can be transfected with nucleic acid encoding all or a portion (e.g., a
cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and β2
microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta
chain protein to thereby express MHC class I or MHC class II proteins on the cell
surface. Expression of the appropriate class I or class II MHC in conjunction with a
peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a
T cell mediated immune response against the transfected tumor cell. Optionally, a gene
encoding an antisense construct which blocks expression of an MHC class II associated
protein, such as the invariant chain, can also be cotransfected with a DNA encoding a
peptide having the activity of a B lymphocyte antigen to promote presentation of tumor
associated antigens and induce tumor specific immunity. Thus, the induction of a T cell
mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured 
by the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without
limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc.
Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974,
1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J.
Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991;
Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which will identify, among others, proteins that modulate T-cell dependent antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those described
in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In
vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons,
Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins that generate predominantly Th1 and CTL responses) include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M.
Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J.
Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by dendritic cells that activate naive T-cells) include, without limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et
al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that regulate
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz
et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and
development include, without limitation, those described in: Antica et al., Blood
84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al.,
Blood 85:2770-2778, 1995; Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to
stimulate the release of FSH. Thus, a polypeptide of the
present invention, alone or in heterodimers with a member of the inhibin family, may be
useful as a contraceptive based on the ability of inhibins to decrease fertility in female
mammals and decrease spermatogenesis in male mammals. Administration of sufficient
amounts of other inhibins can induce infertility in these mammals. Alternatively, the
polypeptide of the invention, as a homodimer or as a heterodimer with other protein
subunits of the inhibin group, may be useful as a fertility inducing therapeutic, based
upon the ability of activin molecules to stimulate FSH release from cells of the anterior
pituitary. See, for example, U.S. Pat. No. 4,798,885. A polypeptide of the invention may
also be useful for advancement of the onset of fertility in sexually immature mammals, so
as to increase the lifetime reproductive performance of domestic animals such as, but not
limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be 
measured by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale
et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al.,
Proc. Natl. Acad. Sci. USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Chemotactic and chemokinetic receptor activation can be used to mobilize or
attract a desired cell population to a desired site of action. Chemotactic or chemokinetic
compositions (e.g., proteins, antibodies, binding partners, or modulators of the invention)
provide particular advantages in treatment of wounds and other trauma to tissues, as well
as in treatment of localized infections. For example, attraction of lymphocytes,
monocytes or neutrophils to tumors or sites of infection may result in improved immune
responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it
can stimulate, directly or indirectly, the directed orientation or movement of such cell
population. Preferably, the protein or peptide has the ability to directly stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a population
of cells can be readily determined by employing such protein or peptide in any known
assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or
prevent chemotaxis) consist of assays that measure the ability of a protein to induce the
migration of cells across a membrane as well as the ability of a protein to induce the
adhesion of one cell population to another cell population. Suitable assays for movement
and adhesion include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W.
Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12,
Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. Invest.
95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; Muller et al., Eur. J.
Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994; Johnston et
al., J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or
thrombolysis or thrombosis. A polynucleotide of the invention can encode a polypeptide
exhibiting such attributes. Compositions may be useful in treatment of various
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance
coagulation and other hemostatic events in treating wounds resulting from trauma,
surgery or other causes. A composition of the invention may also be useful for dissolving
or inhibiting formation of thromboses and for treatment and prevention of conditions
resulting therefrom (such as, for example, infarction of cardiac and central nervous
system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al.,
Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991);
Schaub, Prostaglandins 35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation,
proliferation or metastasis. Detection of the presence or amount of polynucleotides or
polypeptides of the invention may be useful for the diagnosis and/or prognosis of one or
more types of cancer. For example, the presence or increased expression of a
polynucleotide/polypeptide of the invention may indicate a hereditary risk of cancer, a
precancerous condition, or an ongoing malignancy. Conversely, a defect in the gene or
absence of the polypeptide may be associated with a cancer condition. Identification of
single nucleotide polymorphisms associated with cancer or a predisposition to cancer
may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell
proliferation, inhibiting angiogenesis (growth of new blood vessels that is necessary to
support tumor growth) and/or prohibiting metastasis by reducing tumor cell motility or
invasiveness. Therapeutic compositions of the invention may be effective in adult and
pediatric oncology including in solid phase tumors/malignancies, locally advanced
tumors, human soft tissue sarcomas, metastatic cancer, including lymphatic metastases,
blood cell malignancies including multiple myeloma, acute and chronic leukemias, and
lymphomas, head and neck cancers including mouth cancer, larynx cancer and thyroid
cancer, lung cancers including small cell carcinoma and non-small cell cancers, breast
cancers including small cell carcinoma and ductal carcinoma, gastrointestinal cancers
including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers
including bladder cancer and prostate cancer, malignancies of the female genital tract
including ovarian carcinoma, uterine (including endometrial) cancers, and solid tumor in
the ovarian follicle, kidney cancers including renal cell carcinoma, brain cancers
including intrinsic brain tumors, neuroblastoma, astrocytic brain tumors, gliomas,
metastatic tumor cell invasion in the central nervous system, bone cancers including
osteomas, skin cancers including malignant melanoma, tumor progression of human skin
keratinocytes, squamous cell carcinoma, basal cell carcinoma, hemangiopericytoma and
Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can be
administered in therapeutically effective dosages alone or in combination with adjuvant
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser
therapy, and may provide a beneficial effect, e.g. reducing tumor size, slowing rate of
tumor growth, inhibiting metastasis, or otherwise improving overall clinical condition,
without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as
a portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the
polypeptide or modulator of the invention with one or more anti-cancer drugs in addition
to a pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as
a cancer treatment is routine. Anti-cancer drugs that are well known in the art and can be
used as a treatment in combination with the polypeptide or modulator of the invention
include: Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan,
Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide,
Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl,
Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), Floxuridine,
5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon
Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine,
Mesna, Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin,
Procarbazine HCl, Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine
sulfate, Vincristine sulfate, Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2,
Mitoguazone, Pentostatin, Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for
prophylactic treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g. exposure to carcinogens) known in the art that predispose an individual to
developing cancers. Under these circumstances, it may be beneficial to treat these
individuals with therapeutically effective doses of the polypeptide of the invention to
reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of
the invention as a potential cancer treatment. These in vitro models include proliferation
assays of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney,
(1987) Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York,
NY Ch 18 and Ch 21), tumor systems in nude mice as described in Giovanella et al., J.
Natl. Can. Inst., 52: 921-30 (1974), mobility and invasive potential of tumor cells in
Boyden Chamber assays as described in Pilkington et al., Anticancer Res., 17: 4107-9
(1997), and angiogenesis assays such as induction of vascularization of the chick
chorioallantoic membrane or induction of vascular endothelial cell migration as described
in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp.
Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, e.g.
from American Type Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide
of the invention can encode a polypeptide exhibiting such characteristics. Examples of
such receptors and ligands include, without limitation, cytokine receptors and their
ligands, receptor kinases and their ligands, receptor phosphatases and their ligands,
receptors involved in cell-cell interactions and their ligands (including without limitation,
cellular adhesion molecules (such as selectins, integrins and their ligands) and
receptor/ligand pairs involved in antigen presentation, antigen recognition and
development of cellular and humoral immune responses). Receptors and ligands are also
useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without
limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of
receptor/ligand interactions.

The activity of a polypeptide of the invention may, among other means, be
measured by the following methods:

Suitable assays for receptor-ligand activity include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static
conditions 7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987;
Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med.
169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al.,
Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor
for a ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may
be identified through binding assays, affinity chromatography, dihybrid screening assays,
BIAcore assays, gel overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists
or partial antagonists require the use of other proteins as competing ligands. The
polypeptides of the present invention or ligand(s) thereof may be labeled by being
coupled to radioisotopes, colorimetric molecules or toxin molecules by conventional
methods ("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in
Enzymology Vol. 182 (1990) Academic Press, Inc. San Diego). Examples of
radioisotopes include, but are not limited to, tritium and carbon-14. Examples of
colorimetric molecules include, but are not limited to, fluorescent molecules such as
fluorescamine or rhodamine, or other colorimetric molecules. Examples of toxins
include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using
the novel polypeptides or binding fragments thereof in any of a variety of drug screening
techniques. The polypeptides or fragments employed in such a test may either be free in
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably
transformed with recombinant nucleic acids expressing the polypeptide or a fragment
thereof. Drugs are screened against such transformed cells in competitive binding assays.
Such cells, either in viable or fixed form, can be used for standard binding assays. One
may measure, for example, the formation of complexes between polypeptides of the
invention or fragments and the agent being tested or examine the diminution in complex
formation between the novel polypeptides and an appropriate cell line, which are well
known in the art.
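
The "diminution in complex formation" measured in a competitive binding assay is often summarized as a percent inhibition. The sketch below shows one way that arithmetic might be done; the function name, the background-subtraction step and the signal values are illustrative assumptions rather than part of the described method.

```python
def percent_inhibition(signal_with_compound, signal_without_compound, background=0.0):
    """Percent reduction in specific binding signal caused by a test compound.

    All three values are raw assay readouts in the same (arbitrary) units;
    `background` is the non-specific signal and is subtracted from both.
    """
    specific_total = signal_without_compound - background
    if specific_total <= 0:
        raise ValueError("uninhibited signal must exceed background")
    specific_bound = signal_with_compound - background
    return 100.0 * (1.0 - specific_bound / specific_total)


# Hypothetical readouts: 1000 units uninhibited, 300 with compound, 100 background.
print(f"{percent_inhibition(300, 1000, background=100):.1f}% inhibition")
```

A compound that leaves the signal unchanged gives 0%, and one that reduces it to background gives 100%.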

Sources for test compounds that may be screened for ability to bind to or
modulate (i.e., increase or decrease) the activity of polypeptides of the invention include
(1) inorganic and organic chemical libraries, (2) natural product libraries, and (3)
combinatorial libraries comprised of either random or mimetic peptides, oligonucleotides
or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or 
compounds that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria
and fungi), animals, plants or other vegetation, or marine organisms, and libraries of
mixtures for screening may be created by: (1) fermentation and extraction of broths from
soil, plant or marine microorganisms or (2) extraction of the organisms themselves.
Natural product libraries include polyketides, non-ribosomal peptides, and (non-naturally
occurring) variants thereof. For a review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides or organic compounds and can be readily prepared by traditional
automated synthesis methods, PCR, cloning or proprietary synthetic methods. Of
particular interest are peptide and oligonucleotide combinatorial libraries. Still other
libraries of interest include peptide, protein, peptidomimetic, multiparallel synthetic
collection, recombinatorial, and polypeptide libraries. For a review of combinatorial
chemistry and libraries created therefrom, see Myers, Curr. Opin. Biotechnol. 8:701-707
(1997). For reviews and examples of peptidomimetic libraries, see Al-Obeidi et al., Mol
Biotechnol 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 1(1):114-19 (1997);
Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the
"hit" to bind a polypeptide of the invention. The molecules identified in the binding assay
are then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin
or cholera, or with other compounds that are toxic to cells such as radioisotopes. The
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity
of the binding molecule for a polypeptide of the invention. Alternatively, the binding
molecules may be complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide,
e.g. a ligand or a receptor. The art provides numerous assays particularly useful for
identifying previously unknown binding partners for receptor polypeptides of the
invention. For example, expression cloning using mammalian or bacterial cells, or
dihybrid screening assays can be used to identify polynucleotides encoding binding
partners. As another example, affinity chromatography with the appropriate immobilized
polypeptide of the invention can be used to isolate polypeptides that recognize and bind
polypeptides of the invention. There are a number of different libraries used for the
identification of compounds, and in particular small molecules, that modulate (i.e.,
increase or decrease) biological activity of a polypeptide of the invention. Ligands for
receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical
except for the expression of the receptor of the invention: one cell population expresses
the receptor of the invention whereas the other does not. The responses of the two cell
populations to the addition of ligand(s) are then compared. Alternatively, an expression
library can be co-expressed with the polypeptide of the invention in cells and assayed for
an autocrine response to identify potential ligand(s). As still another example, BIAcore
assays, gel overlay assays, or other methods known in the art can be used to identify
binding partner polypeptides, including, (1) organic and inorganic chemical libraries, (2)
natural product libraries, and (3) combinatorial libraries comprised of random peptides,
oligonucleotides or organic molecules.
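
The two-population comparison described above (cells expressing the receptor versus otherwise identical cells that do not) can be sketched as a simple fold-change screen. The ligand names, response values and the two-fold cutoff below are hypothetical and serve only to illustrate the comparison.

```python
def receptor_dependent_ligands(expressing, control, min_fold=2.0):
    """Return ligands whose response in receptor-expressing cells is at least
    `min_fold` times the response in the matched non-expressing cells.

    `expressing` and `control` map ligand name -> measured response
    (same arbitrary units). The cutoff is an illustrative assumption.
    """
    hits = []
    for ligand, signal in expressing.items():
        baseline = control[ligand]
        if baseline > 0 and signal / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Hypothetical responses for three candidate ligands.
expressing = {"ligand_A": 9.0, "ligand_B": 1.1, "ligand_C": 4.2}
control = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 2.0}
print(receptor_dependent_ligands(expressing, control))
```

Because the two populations differ only in receptor expression, a response seen in one and not the other can be attributed to the receptor rather than to other features of the cells.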

The role of downstream intracellular signaling molecules in the signaling cascade
of the polypeptide of the invention can be determined. For example, a chimeric protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a protein, whose ligand has been identified, is produced in a host
cell. The cell is then incubated with the ligand specific for the extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins
involved in intracellular signaling can then be assayed for expected modifications, i.e.
phosphorylation. Other methods known to those in the art can also be used to identify
signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory
activity. The anti-inflammatory activity may be achieved by providing a stimulus to cells
involved in the inflammatory response, by inhibiting or promoting cell-cell interactions
(such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells
involved in the inflammatory process, inhibiting or promoting cell extravasation, or by
stimulating or suppressing production of other factors which more directly inhibit or
promote an inflammatory response. Compositions with such activities can be used to treat
inflammatory conditions (including chronic or acute conditions), including without
limitation inflammation associated with infection (such as septic shock, sepsis or systemic
inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin
lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting
from over production of cytokines such as TNF or IL4. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or
material. Compositions of this invention may be utilized to prevent or treat conditions
such as, but not limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced
shock, rheumatoid arthritis, chronic inflammatory arthritis, pancreatic cell damage from
diabetes mellitus type 1, graft versus host disease, inflammatory bowel disease,
inflammation associated with pulmonary disease, other autoimmune disease or
inflammatory disease, an antiproliferative agent such as for acute or chronic myelogenous
leukemia or in the prevention of premature labor secondary to intrauterine infections.

4.10.16 LEUKEMIAS

Leukemias and related disorders may be treated or prevented by administration of
a therapeutic that promotes or inhibits function of the polynucleotides and/or
polypeptides of the invention. Such leukemias and related disorders include but are not
limited to acute leukemia, acute lymphocytic leukemia, acute myelocytic leukemia,
myeloblastic, promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic
leukemia, chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia
(for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B.
Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication
of therapeutic utility, include but are not limited to nervous system injuries, and diseases
or disorders which result in either a disconnection of axons, a diminution or degeneration
of neurons, or demyelination. Nervous system lesions which may be treated in a patient
(including human and non-human mammalian patients) according to the invention
include but are not limited to the following lesions of either the central (including spinal
cord, brain) or peripheral nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated
with surgery, for example, lesions which sever a portion of the nervous system, or
compression injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous
system results in neuronal injury or death, including cerebral infarction or ischemia, or
spinal cord infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme
disease, tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed
or injured as a result of a degenerative process including but not limited to degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or
amyotrophic lateral sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion
of the nervous system is destroyed or injured by a nutritional disorder or disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus,
carcinoma, or sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or injured by a demyelinating disease including but not limited to multiple
sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy
of various etiologies, progressive multifocal leukoencephalopathy, and central pontine
myelinolysis.

Therapeutics which are useful according to the invention for treatment of a
nervous system disorder may be selected by testing for biological activity in promoting
the survival or differentiation of neurons. For example, and not by way of limitation,
therapeutics which elicit any of the following effects may be useful according to the
invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting
of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol.
70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor
neuron dysfunction may be measured by assessing the physical manifestation of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional
disability.

In specific embodiments, motor neuron disorders that may be treated according to
the invention include but are not limited to disorders such as infarction, infection,
exposure to toxin, trauma, surgical damage, degenerative disease or malignancy that may
affect motor neurons as well as other components of the nervous system, as well as
disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and
including but not limited to progressive spinal muscular atrophy, progressive bulbar
palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive
bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio
syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and other
parasites; affecting (suppressing or enhancing) bodily characteristics, including, without
limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or organ or body part size or shape (such as, for example, breast
augmentation or diminution, change in bone form or shape); affecting biorhythms or
circadian cycles or rhythms; affecting the fertility of male or female subjects; affecting
the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other nutritional
factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or
other pain reducing effects; promoting differentiation and growth of embryonic stem cells
in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case
of enzymes, correcting deficiencies of the enzyme and treating deficiency-related
diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis);
immunoglobulin-like activity (such as, for example, the ability to bind antigens or
complement); and the ability to act as an antigen in a vaccine composition to raise an
immune response against such protein or another material or entity which is
cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential
predisposition or susceptibility to various disease states (such as disorders involving
inflammation or immune response) or a differential response to drug administration, and
this genetic information can be used to tailor preventive or therapeutic treatment
appropriately. For example, the existence of a polymorphism associated with a
predisposition to inflammation or autoimmune disease makes possible the diagnosis of
this condition in humans by identifying the presence of the polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample,
optionally involving isolation or amplification of the DNA, and identifying the presence
of the polymorphism in the DNA. For example, PCR may be used to amplify an
appropriate fragment of genomic DNA which may then be sequenced. Alternatively, the
DNA may be subjected to allele-specific oligonucleotide hybridization (in which
appropriate oligonucleotides are hybridized to the DNA under conditions permitting
detection of a single base mismatch) or to a single nucleotide extension assay (in which
an oligonucleotide that hybridizes immediately adjacent to the position of the
polymorphism is extended with one or more labeled nucleotides). In addition, traditional
restriction fragment length polymorphism analysis (using restriction enzymes that
provide differential digestion of the genomic DNA depending on the presence or absence
of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences
of the present invention. In the alternative, any one of the nucleotide sequences of the
present invention can be placed on the array to detect changes from those sequences.
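
To illustrate the restriction fragment length polymorphism analysis mentioned above, the following minimal sketch digests two short alleles in silico and compares the resulting fragment lengths. The sequences are invented for illustration and are not sequences of the invention; EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example enzyme.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of `site` and return
    the fragment lengths in order. cut_offset=1 models EcoRI (G^AATTC)."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical alleles differing by one base inside the recognition site.
reference = "ATGCCGAATTCTTAGGCATCG"  # contains GAATTC
variant = "ATGCCGAGTTCTTAGGCATCG"    # A->G change destroys the site

print("reference fragments:", digest(reference))
print("variant fragments:  ", digest(variant))
```

The single-base change removes the recognition site, so the variant allele stays uncut while the reference allele yields two fragments; that difference in fragment pattern is what the analysis detects.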

Alternatively a polymorphism resulting in a change in the amino acid sequence 
could also be detected by detecting a corresponding change in amino acid sequence of the 
protein, e.g., by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the protocol is
described by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963,
Int. Arch. Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a
single injection, generally intradermally, of a suspension of killed Mycobacterium
tuberculosis in complete Freund's adjuvant (CFA). The route of injection can vary, but
rats may be injected at the base of the tail with an adjuvant mixture. The polypeptide is
administered in phosphate buffered solution (PBS) at a dose of about 1-5 mg/kg. The
control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by
immediately administering the test compound and subsequent treatment every other day
until day 24. At 14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium in CFA, an
overall arthritis score may be obtained as described by J. Holoshitz above. An analysis of
the data would reveal that the test compound would have a dramatic effect on the
swelling of the joints as measured by a decrease of the arthritis score.
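
The dosing and scoring timetable described above can be laid out programmatically. The sketch below assumes that the day of adjuvant injection is day 0, that "immediately administering" means a first dose on day 0, and that "every other day" means a two-day interval; these readings, and all names in the code, are assumptions.

```python
# Scoring days as listed in the text (days after adjuvant injection).
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def dosing_days(last_day=24, interval=2):
    """Days on which the test compound is given: day 0, then every `interval` days."""
    return list(range(0, last_day + 1, interval))


def study_plan(last_day=24):
    """Map each active study day to the actions scheduled on it."""
    doses = set(dosing_days(last_day))
    plan = {}
    for day in range(last_day + 1):
        actions = []
        if day in doses:
            actions.append("dose")
        if day in SCORING_DAYS:
            actions.append("score")
        if actions:
            plan[day] = actions
    return plan


for day, actions in study_plan().items():
    print(f"day {day:2d}: {', '.join(actions)}")
```

Under these assumptions day 15 is the only scoring day that does not coincide with a dose.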

4.11 THERAPEUTIC METHODS

The compositions (including polypeptide fragments, analogs, variants and
antibodies or other binding partners or modulators including antisense polynucleotides)
of the invention have numerous applications in a variety of therapeutic methods.
Examples of therapeutic applications include, but are not limited to, those exemplified
herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of
the polypeptides or other composition of the invention to individuals affected by a
disease or disorder that can be modulated by regulating the peptides of the invention.
While the mode of administration is not particularly important, parenteral administration
is preferred. An exemplary mode of administration is to deliver an intravenous bolus.
The dosage of the polypeptides or other composition of the invention will normally be
determined by the prescribing physician. It is to be expected that the dosage will vary
according to the age, weight, condition and response of the individual patient. Typically,
the amount of polypeptide administered per dose will be in the range of about 0.01 µg/kg
to 100 mg/kg of body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg
of patient body weight. For parenteral administration, polypeptides of the invention will
be formulated in an injectable form combined with a pharmaceutically acceptable
parenteral vehicle. Such vehicles are well known in the art and examples include water,
saline, Ringer's solution, dextrose solution, and solutions consisting of small amounts of
human serum albumin. The vehicle may contain minor amounts of additives that
maintain the isotonicity and stability of the polypeptide or other active ingredient. The
preparation of such solutions is within the skill of the art.

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source
derived, including without limitation from recombinant and non-recombinant sources and
including antibodies and other binding partners of the polypeptides of the invention) may
be administered to a patient in need, by itself, or in pharmaceutical compositions where it
is mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of
disorders. Such a composition may optionally contain (in addition to protein or other
active ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and
other materials well known in the art. The term "pharmaceutically acceptable" means a
non-toxic material that does not interfere with the effectiveness of the biological activity
of the active ingredient(s). The characteristics of the carrier will depend on the route of
administration. The pharmaceutical composition of the invention may also contain
cytokines, lymphokines, or other hematopoietic factors such as M-CSF, GM-CSF, TNF,
IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14,
IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell factor,
and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These
agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β),
insulin-like growth factor (IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either
enhance the activity of the protein or other active ingredient or complement its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or other active
ingredient of the invention, or to minimize side effects. Conversely, protein or other
active ingredient of the present invention may be included in formulations of the
particular clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic
or anti-thrombotic factor, or anti-inflammatory agent to minimize side effects of the
clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or
anti-thrombotic factor, or anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2,
anti-TNF, corticosteroids, immunosuppressive agents). A protein of the present
invention may be active in multimers (e.g., heterodimers or homodimers) or complexes
with itself or other proteins. As a result, pharmaceutical compositions of the invention
may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the
invention including a first protein, a second protein or a therapeutic agent may be
concurrently administered with the first protein (e.g., at the same time, or at differing
times provided that therapeutic concentrations of the combination of agents are achieved at
the treatment site). Techniques for formulation and administration of the compounds of
the instant application may be found in "Remington's Pharmaceutical Sciences," Mack
Publishing Co., Easton, PA, latest edition. A therapeutically effective dose further refers
to that amount of the compound sufficient to result in amelioration of symptoms, e.g.,
treatment, healing, prevention or amelioration of the relevant medical condition, or an
increase in rate of treatment, healing, prevention or amelioration of such conditions.
When applied to an individual active ingredient, administered alone, a therapeutically
effective dose refers to that ingredient alone. When applied to a combination, a
therapeutically effective dose refers to combined amounts of the active ingredients that
result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a
therapeutically effective amount of protein or other active ingredient of the present
invention is administered to a mammal having a condition to be treated. Protein or other
active ingredient of the present invention may be administered in accordance with the
method of the invention either alone or in combination with other therapies such as
treatments employing cytokines, lymphokines or other hematopoietic factors. When
co-administered with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other active ingredient of the present invention may be administered either
simultaneously with the cytokine(s), lymphokine(s), other hematopoietic factor(s),
thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician will decide on the appropriate sequence of administering protein or
other active ingredient of the present invention in combination with cytokine(s),
lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or intestinal administration; parenteral delivery, including intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of
protein or other active ingredient of the present invention used in the pharmaceutical
composition or to practice the method of the present invention can be carried out in a
variety of conventional ways, such as oral ingestion, inhalation, topical application or
cutaneous, subcutaneous, intraperitoneal, parenteral or intravenous injection. Intravenous
administration to the patient is preferred.

Alternatively, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic joints or
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the
compounds may be administered topically, for example, as eye drops. Furthermore, one
may administer the drug in a targeted drug delivery system, for example, in a liposome
coated with a specific antibody, targeting, for example, arthritic or fibrotic tissue. The
liposomes will be targeted to and taken up selectively by the afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an
effective dosage to the desired site of action. The determination of a suitable route of
administration and an effective dosage for a particular indication is within the level of
skill in the art. Preferably for wound treatment, one administers the therapeutic
compound directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be extrapolated from these dosages or from similar studies in appropriate
animal models. Dosages can then be adjusted as necessary by the clinician to provide
maximal therapeutic benefit.


4.12.2 COMPOSITIONS/FORMULATIONS

Pharmaceutical compositions for use in accordance with the present invention
thus may be formulated in a conventional manner using one or more physiologically
acceptable carriers comprising excipients and auxiliaries which facilitate processing of
the active compounds into preparations which can be used pharmaceutically. These
pharmaceutical compositions may be manufactured in a manner that is itself known, e.g.,
by means of conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying, encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon the route of administration chosen. When a therapeutically effective
amount of protein or other active ingredient of the present invention is administered
orally, protein or other active ingredient of the present invention will be in the form of a
tablet, capsule, powder, solution or elixir. When administered in tablet form, the
pharmaceutical composition of the invention may additionally contain a solid carrier such
as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95%
protein or other active ingredient of the present invention, and preferably from about 25
to 90% protein or other active ingredient of the present invention. When administered in
liquid form, a liquid carrier such as water, petroleum, oils of animal or plant origin such
as peanut oil, mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The
liquid form of the pharmaceutical composition may further contain physiological saline
solution, dextrose or other saccharide solution, or glycols such as ethylene glycol,
propylene glycol or polyethylene glycol. When administered in liquid form, the
pharmaceutical composition contains from about 0.5 to 90% by weight of protein or other
active ingredient of the present invention, and preferably from about 1 to 50% protein or
other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of
the present invention is administered by intravenous, cutaneous or subcutaneous
injection, protein or other active ingredient of the present invention will be in the form of
a pyrogen-free, parenterally acceptable aqueous solution. The preparation of such
parenterally acceptable protein or other active ingredient solutions, having due regard to
pH, isotonicity, stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition for intravenous, cutaneous, or subcutaneous injection should
contain, in addition to protein or other active ingredient of the present invention, an
isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, Dextrose
Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or other
vehicle as known in the art. The pharmaceutical composition of the present invention
may also contain stabilizers, preservatives, buffers, antioxidants, or other additives
known to those of skill in the art. For injection, the agents of the invention may be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks' solution, Ringer's solution, or physiological saline buffer. For transmucosal
administration, penetrants appropriate to the barrier to be permeated are used in the
formulation. Such penetrants are generally known in the art.

For oral administration, the compounds can be formulated readily by combining
the active compounds with pharmaceutically acceptable carriers well known in the art.
Such carriers enable the compounds of the invention to be formulated as tablets, pills,
dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be
obtained by combining the active compounds with a solid excipient, optionally grinding
the resulting mixture, and processing the mixture of granules, after adding suitable
auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol;
cellulose preparations such as, for example, maize starch, wheat starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium
alginate. Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic, talc,
polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may
be added to the tablets or dragee coatings for identification or to characterize different
combinations of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules
made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as
glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture
with filler such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene glycols. In addition, stabilizers may be added. All formulations for
oral administration should be in dosages suitable for such administration. For buccal
administration, the compositions may take the form of tablets or lozenges formulated in
conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon
dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be
determined by providing a valve to deliver a metered amount. Capsules and cartridges
of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder
mix of the compound and a suitable powder base such as lactose or starch. The
compounds may be formulated for parenteral administration by injection, e.g., by bolus
injection or continuous infusion. Formulations for injection may be presented in unit
dosage form, e.g., in ampules or in multi-dose containers, with an added preservative.
The compositions may take such forms as suspensions, solutions or emulsions in oily or
aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing
and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous
solutions of the active compounds in water-soluble form. Additionally, suspensions of
the active compounds may be prepared as appropriate oily injection suspensions.
Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic
fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection
suspensions may contain substances which increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may
also contain suitable stabilizers or agents which increase the solubility of the compounds
to allow for the preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile
pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as
suppositories or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or other glycerides. In addition to the formulations described previously, the
compounds may also be formulated as a depot preparation. Such long acting
formulations may be administered by implantation (for example subcutaneously or
intramuscularly) or by intramuscular injection. Thus, for example, the compounds may
be formulated with suitable polymeric or hydrophobic materials (for example as an
emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives,
for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible
organic polymer, and an aqueous phase. The co-solvent system may be the VPD
co-solvent system. VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar
surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in
absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1
with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic
compounds well, and itself produces low toxicity upon systemic administration.
Naturally, the proportions of a co-solvent system may be varied considerably without
destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar
surfactants may be used instead of polysorbate 80; the fraction size of polyethylene
glycol may be varied; other biocompatible polymers may replace polyethylene glycol,
e.g., polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds may be employed. Liposomes and emulsions are well known examples of
delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by
those skilled in the art. Sustained-release capsules may, depending on their chemical
nature, release the compounds for a few weeks up to over 100 days. Depending on the
chemical nature and the biological stability of the therapeutic reagent, additional
strategies for protein or other active ingredient stabilization may be employed.
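
As an illustration of the VPD:5W composition described above, the following sketch computes the nominal concentration of each component after the 1:1 dilution. It assumes that "diluted 1:1" means equal volumes of VPD and 5% dextrose in water and that volumes are additive, which is only an approximation for ethanol/water mixtures; the function and variable names are hypothetical.

```python
# Illustrative arithmetic only. Percentages are taken from the text;
# names and the additive-volume assumption are not part of the original.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_PERCENT_WV = 5.0  # the 5% dextrose in water diluent ("5W")


def dilute_one_to_one(vpd, dextrose_percent):
    """Return % w/v of each component after mixing equal volumes of VPD and 5W."""
    # Equal-volume mixing halves the concentration of every solute.
    mixed = {name: pct / 2.0 for name, pct in vpd.items()}
    mixed["dextrose"] = dextrose_percent / 2.0
    return mixed


vpd_5w = dilute_one_to_one(VPD_PERCENT_WV, DEXTROSE_PERCENT_WV)
for name, pct in vpd_5w.items():
    print(f"{name}: {pct:.2f}% w/v")
```

Under these assumptions the diluted system carries half the stated VPD concentration of each excipient, with the balance being ethanol and water.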

The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers or excipients. Examples of such carriers or excipients include but are not limited
to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be provided as salts with pharmaceutically compatible counter ions. Such
pharmaceutically acceptable base addition salts are those salts which retain the biological
effectiveness and properties of the free acids and which are obtained by reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate,
potassium benzoate, triethanol amine and the like.

The pharmaceutical composition of the invention may be in the form of a
complex of the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide antigens. The protein and/or peptide antigen will deliver a stimulatory
signal to both B and T lymphocytes. B lymphocytes will respond to antigen through their
surface immunoglobulin receptor. T lymphocytes will respond to antigen through the T
cell receptor (TCR) following presentation of the antigen by MHC proteins. MHC and
structurally related proteins including those encoded by class I and class II MHC genes
on host cells will serve to present the peptide antigen(s) to T lymphocytes. The antigen
components could also be supplied as purified MHC-peptide complexes alone or with
co-stimulatory molecules that can directly signal T cells. Alternatively, antibodies able to
bind surface immunoglobulin and other molecules on B cells as well as antibodies able to
bind the TCR and other molecules on T cells can be combined with the pharmaceutical
composition of the invention.

The pharmaceutical composition of the invention may be in the form of a
liposome in which protein of the present invention is combined, in addition to other
pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist
in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers
in aqueous solution. Suitable lipids for liposomal formulation include, without limitation,
monoglycerides, diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like. Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed, for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and
4,737,323, all of which are incorporated herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and
severity of the condition being treated, and on the nature of prior treatments which the
patient has undergone. Ultimately, the attending physician will decide the amount of
protein or other active ingredient of the present invention with which to treat each
individual patient. Initially, the attending physician will administer low doses of protein
or other active ingredient of the present invention and observe the patient's response.
Larger doses of protein or other active ingredient of the present invention may be
administered until the optimal therapeutic effect is obtained for the patient, and at that
point the dosage is not increased further. It is contemplated that the various
pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more
preferably about 0.1 µg to about 1 mg) of protein or other active ingredient of the present
invention per kg body weight. For compositions of the present invention which are
useful for bone, cartilage, tendon or ligament regeneration, the therapeutic method
includes administering the composition topically, systemically, or locally as an implant
or device. When administered, the therapeutic composition for use in this invention is, of
course, in a pyrogen-free, physiologically acceptable form. Further, the composition may
desirably be encapsulated or injected in a viscous form for delivery to the site of bone,
cartilage or tissue damage. Topical administration may be suitable for wound healing
and tissue repair. Therapeutically useful agents other than a protein or other active
ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone
and/or cartilage formation, the composition would include a matrix capable of delivering
the protein-containing or other active ingredient-containing composition to the site of
bone and/or cartilage damage, providing a structure for the developing bone and cartilage
and optimally capable of being resorbed into the body. Such matrices may be formed of
materials presently in use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The particular
application of the compositions will define the appropriate formulation. Potential
matrices for the compositions may be biodegradable and chemically defined calcium
sulfate, tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and
polyanhydrides. Other potential materials are biodegradable and biologically
well-defined, such as bone or dermal collagen. Further matrices are comprised of pure
proteins or extracellular matrix components. Other potential matrices are
nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the
above mentioned types of material, such as polylactic acid and hydroxyapatite or
collagen and tricalcium phosphate. The bioceramics may be altered in composition, such
as in calcium-aluminate-phosphate, and in processing, to alter pore size, particle size,
particle shape, and biodegradability. Presently preferred is a 50:50 (mole weight)
copolymer of lactic acid and glycolic acid in the form of porous particles having diameters
ranging from 150 to 800 microns. In some applications, it will be useful to utilize a
sequestering agent, such as carboxymethyl cellulose or autologous blood clot, to prevent
the protein compositions from dissociating from the matrix.

A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which
represents the amount necessary to prevent desorption of the protein from the polymer
matrix and to provide appropriate handling of the composition, yet not so much that the
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein
the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with
other agents beneficial to the treatment of the bone and/or cartilage defect, wound, or
tissue in question. These agents include various growth factors such as epidermal growth
factor (EGF), platelet derived growth factor (PDGF), transforming growth factors
(TGF-α and TGF-β), and insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary
applications. Particularly domestic animals and thoroughbred horses, in addition to
humans, are desired patients for such treatment with proteins or other active ingredients
of the present invention. The dosage regimen of a protein-containing pharmaceutical
composition to be used in tissue regeneration will be determined by the attending
physician considering various factors which modify the action of the proteins, e.g.,
amount of tissue weight desired to be formed, the site of damage, the condition of the
damaged tissue, the size of a wound, type of damaged tissue (e.g., bone), the patient's
age, sex, and diet, the severity of any infection, time of administration and other clinical
factors. The dosage may vary with the type of matrix used in the reconstitution and with
inclusion of other proteins in the pharmaceutical composition. For example, the addition
of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition may also affect the dosage. Progress can be monitored by periodic
assessment of tissue/bone growth and/or repair, for example, X-rays, histomorphometric
determinations and tetracycline labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including,
without limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in the presence of proteins of the present invention in order to proliferate
or to produce a desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to
achieve their intended purpose. More specifically, a therapeutically effective amount
means an amount effective to prevent development of or to alleviate the existing
symptoms of the subject being treated. Determination of the effective amount is well
within the capability of those skilled in the art, especially in light of the detailed
disclosure provided herein. For any compound used in the method of the invention, the
therapeutically effective dose can be estimated initially from appropriate in vitro assays.
For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e.,
the concentration of the test compound which achieves a half-maximal inhibition of the
protein's biological activity). Such information can be used to more accurately determine
useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results
in amelioration of symptoms or a prolongation of survival in a patient. Toxicity and
therapeutic efficacy of such compounds can be determined by standard pharmaceutical
procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the
dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic effects is the
therapeutic index and it can be expressed as the ratio between LD50 and ED50.
Compounds which exhibit high therapeutic indices are preferred. The data obtained from
these cell culture assays and animal studies can be used in formulating a range of dosage
for use in humans. The dosage of such compounds lies preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity. The dosage
may vary within this range depending upon the dosage form employed and the route of
administration utilized. The exact formulation, route of administration and dosage can be
chosen by the individual physician in view of the patient's condition. See, e.g., Fingl et
al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount
and interval may be adjusted individually to provide plasma levels of the active moiety
which are sufficient to maintain the desired effects, or minimal effective concentration
(MEC). The MEC will vary for each compound but can be estimated from in vitro data.
Dosages necessary to achieve the MEC will depend on individual characteristics and
route of administration. However, HPLC assays or bioassays can be used to determine
plasma concentrations.
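
By way of illustration only, the therapeutic index described above can be computed as sketched below in Python. The function name and the numeric values are hypothetical and are not taken from this disclosure; the sketch simply expresses the ratio between LD50 and ED50.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, expressed as the ratio LD50 / ED50.

    Both doses must be given in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values for two compounds (illustrative only).
compound_a = therapeutic_index(ld50=500.0, ed50=5.0)  # wide margin between toxic and effective dose
compound_b = therapeutic_index(ld50=40.0, ed50=10.0)  # narrow margin
```

Under these assumed values, the first compound would be preferred because it exhibits the higher index.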

Dosage intervals can also be determined using the MEC value. Compounds should
be administered using a regimen which maintains plasma levels above the MEC for
10-90% of the time, preferably between 30-90% and most preferably between 50-90%.
In cases of local administration or selective uptake, the effective local concentration of
the drug may not be related to plasma concentration.
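
The time-above-MEC criterion can be checked against measured plasma levels, for example as in the following Python sketch. It assumes equally spaced samples over one dosing interval, so that the fraction of samples above the MEC approximates the fraction of time above the MEC. The function names and sample values are hypothetical.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of equally spaced plasma samples whose level exceeds the MEC."""
    if not concentrations:
        raise ValueError("at least one sample is required")
    above = sum(1 for c in concentrations if c > mec)
    return above / len(concentrations)


def classify_regimen(fraction):
    """Map a fraction of time above the MEC onto the ranges given in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "acceptable (10-90%)"
    return "outside the 10-90% range"


# Hypothetical hourly plasma levels (same units as the MEC) over one dosing interval.
samples = [0.5, 2.0, 4.0, 3.5, 2.5, 1.8, 1.2, 0.9, 0.6, 0.4]
frac = fraction_above_mec(samples, mec=1.0)
label = classify_regimen(frac)
```

Unequally spaced samples would require weighting each sample by the time it represents, which this sketch does not attempt.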

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily,
with the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily,
varying in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at longer or shorter intervals.
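
As a worked illustration of the weight-based ranges above, the following Python sketch converts a per-kilogram range into a total daily dose for a given body weight. It reads the lower bounds of both ranges as micrograms per kilogram, which is an assumption about the units; the function name and the 70 kg body weight are likewise hypothetical.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def daily_dose_range_mg(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Return the (low, high) total daily dose in mg for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg / UG_PER_MG * weight_kg
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg adult; lower bounds assumed to be in micrograms per kilogram.
broad = daily_dose_range_mg(70.0, low_ug_per_kg=0.01, high_mg_per_kg=100.0)
preferred = daily_dose_range_mg(70.0, low_ug_per_kg=0.1, high_mg_per_kg=25.0)
```

The computed figures only bound the range; the dose actually administered remains a matter for the prescribing physician.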

The amount of composition administered will, of course, be dependent on the
subject being treated, on the subject's age and weight, the severity of the affliction, the
manner of administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device 
which may contain one or more unit dosage forms containing the active ingredient. The 
pack may, for example, comprise metal or plastic foil, such as a blister pack. The pack or
dispenser device may be accompanied by instructions for administration. Compositions 
comprising a compound of the invention formulated in a compatible pharmaceutical 
carrier may also be prepared, placed in an appropriate container, and labeled for 
treatment of an indicated condition.

4.13 ANTIBODIES

Also included in the invention are antibodies to proteins, or fragments of proteins
of the invention. The term "antibody" as used herein refers to immunoglobulin molecules 
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules 
that contain an antigen-binding site that specifically binds (immunoreacts with) an 
antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal,
chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In
general, an antibody molecule obtained from humans relates to any of the classes IgG, 
IgM, IgA, IgE and IgD, which differ from one another by the nature of the heavy chain 
present in the molecule. Certain classes have subclasses as well, such as IgG1, IgG2, and
others. Furthermore, in humans, the light chain may be a kappa chain or a lambda chain.
Reference herein to antibodies includes a reference to all such classes, subclasses and 
types of human antibody species. 

An isolated related protein of the invention may be intended to serve as an 
antigen, or a portion or fragment thereof, and additionally can be used as an immunogen
to generate antibodies that immunospecifically bind the antigen, using standard
techniques for polyclonal and monoclonal antibody preparation. The full-length protein
can be used or, alternatively, the invention provides antigenic peptide fragments of the
antigen for use as immunogens. An antigenic peptide fragment comprises at least 6
amino acid residues of the amino acid sequence of the full length protein, such as an
amino acid sequence shown in SEQ ID NO: 1-438, and encompasses an epitope thereof
such that an antibody raised against the peptide forms a specific immune complex with
the full length protein or with any fragment that contains the epitope. Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located
on its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of alpha-2-macroglobulin-like protein that is located on the 
surface of the protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human
related protein sequence will indicate which regions of a related protein are particularly
hydrophilic and, therefore, are likely to encode surface residues useful for targeting
antibody production. As a means for targeting antibody production, hydropathy plots
showing regions of hydrophilicity and hydrophobicity may be generated by any method
well known in the art, including, for example, the Kyte Doolittle or the Hopp Woods
methods, either with or without Fourier transformation. See, e.g., Hopp and Woods,
1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol. Biol.
157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
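
A hydropathy profile of the kind referred to above can be sketched in Python as follows. The sketch uses the published Kyte-Doolittle hydropathy scale with a simple sliding-window mean and no Fourier transformation. The window length of 7, the function names and the example sequence are assumptions made for illustration; the sequence is not taken from the sequence listing.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start_index, peptide, mean_score) for the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence used only to exercise the functions.
example = "MKLVVLAIIAGSDKRKEEDNQSTPLLVVAIFG"
start, peptide, score = most_hydrophilic_window(example, window=7)
```

Windows with the most negative mean score are the most hydrophilic and are therefore candidate surface regions; a low score alone does not establish that a peptide is antigenic.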

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

The term "specific for" indicates that the variable regions of the antibodies of the 
invention recognize and bind polypeptides of the invention exclusively (i.e., able to
distinguish the polypeptide of the invention from other similar polypeptides despite
sequence identity, homology, or similarity found in the family of polypeptides), but may
also interact with other proteins (for example, S. aureus protein A or other antibodies in
ELISA techniques) through interactions with sequences outside the variable region of the
antibodies, and in particular, in the constant region of the molecule. Screening assays to
determine binding specificity of an antibody of the invention are well known and
routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow
et al. (Eds), Antibodies: A Laboratory Manual; Cold Spring Harbor Laboratory; Cold
Spring Harbor, NY (1988), Chapter 6. Antibodies that recognize and bind fragments of
the polypeptides of the invention are also contemplated, provided that the antibodies are
first and foremost specific for, as defined above, full-length polypeptides of the
invention. As with antibodies that are specific for full length polypeptides of the
invention, antibodies of the invention that recognize fragments are those which can
distinguish polypeptides from the same family of polypeptides despite inherent sequence
identity, homology, or similarity found in the family of proteins.

Antibodies of the invention are useful for, for example, therapeutic purposes (by
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the
invention. Kits comprising an antibody of the invention for any of the purposes
described herein are also comprehended. In general, a kit of the invention also includes a
control antigen for which the antibody is immunospecific. The invention further provides
a hybridoma that produces an antibody according to the invention. Antibodies of the
invention are useful for detection and/or purification of the polypeptides of the invention.

Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics for both conditions
associated with the protein and also in the treatment of some forms of cancer where
abnormal expression of the protein is involved. In the case of cancerous cells or
leukemic cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and preventing the metastatic spread of the cancerous cells, which may be
mediated by the protein.

The labeled antibodies of the present invention can be used for in vitro, in vivo,
and in situ assays to identify cells or tissues in which a fragment of the polypeptide of
interest is expressed. The antibodies may also be used directly in therapies or other
diagnostics. The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics such as
polycarbonate, complex carbohydrates such as agarose and Sepharose®, acrylic resins
such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such
solid supports are well known in the art (Weir, D.M. et al., "Handbook of Experimental
Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10
(1986); Jacoby, W.D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The
immobilized antibodies of the present invention can be used for in vitro, in vivo, and in
situ assays as well as for immuno-affinity purification of the proteins of the present
invention.

Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of
these antibodies are discussed below.

4.13.1 POLYCLONAL ANTIBODIES 

For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with
the native protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate immunogenic preparation can contain, for example, the naturally occurring
immunogenic protein, a chemically synthesized polypeptide representing the
immunogenic protein, or a recombinantly expressed immunogenic protein. Furthermore,
the protein may be conjugated to a second protein known to be immunogenic in the
mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and
soybean trypsin inhibitor. The preparation can further include an adjuvant. Various
adjuvants used to increase the immunological response include, but are not limited to,
Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide),
surface-active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin
and Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants that can be employed include MPL-TDM adjuvant (monophosphoryl lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can
be isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No.
8 (April 17, 2000), pp. 25-28).

4.13.2 MONOCLONAL ANTIBODIES

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", 
as used herein, refers to a population of antibody molecules that contain only one 
molecular species of antibody molecule consisting of a unique light chain gene product 
and a unique heavy chain gene product. In particular, the complementarity determining
regions (CDRs) of the monoclonal antibody are identical in all the molecules of the 
population. MAbs thus contain an antigen-binding site capable of immunoreacting with a 
particular epitope of the antigen characterized by a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing
antibodies that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment
thereof or a fusion protein thereof. Generally, either peripheral blood lymphocytes are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form
a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice, Academic
Press, (1986) pp. 59-103). Immortalized cell lines are usually transformed mammalian
cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or
mouse myeloma cell lines are employed. The hybridoma cells can be cultured in a
suitable culture medium that preferably contains one or more substances that inhibit the
growth or survival of the unfused, immortalized cells. For example, if the parental cells
lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT),
the culture medium for the hybridomas typically will include hypoxanthine, aminopterin,
and thymidine ("HAT medium"), which substances prevent the growth of
HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable
high level expression of antibody by the selected antibody-producing cells, and are
sensitive to a medium such as HAT medium. More preferred immortalized cell lines are
murine myeloma lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center, San Diego, California and the American Type Culture Collection,
Manassas, Virginia. Human myeloma and mouse-human heteromyeloma cell lines also
have been described for the production of human monoclonal antibodies (Kozbor, J.
Immunol., 133:3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques
and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be
assayed for the presence of monoclonal antibodies directed against the antigen.
Preferably, the binding specificity of monoclonal antibodies produced by the hybridoma
cells is determined by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA). Such
techniques and assays are known in the art. The binding affinity of the monoclonal
antibody can, for example, be determined by the Scatchard analysis of Munson and
Pollard, Anal. Biochem., 107:220 (1980). Preferably, antibodies having a high degree of
specificity and a high binding affinity for the target antigen are isolated.
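
For illustration, a classical linear Scatchard transformation can be sketched in Python as shown below. This is a minimal sketch assuming single-site binding and noise-free data; it is not the specific computational procedure of the cited reference. The function name and the synthetic values are hypothetical.

```python
def scatchard_fit(free, bound):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax).

    For single-site binding, bound/free = (Bmax - bound) / Kd, so the
    slope of the line is -1/Kd and its x-intercept is Bmax.
    """
    if len(free) != len(bound) or len(free) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic data generated from hypothetical Kd = 2.0 and Bmax = 10.0
# (arbitrary but consistent concentration units).
true_kd, true_bmax = 2.0, 10.0
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [true_bmax * f / (true_kd + f) for f in free_conc]
kd, bmax = scatchard_fit(free_conc, bound_conc)
```

A smaller dissociation constant Kd corresponds to a higher binding affinity. With real, noisy data a nonlinear fit of the untransformed binding curve is generally more reliable than the linearized plot.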

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods. Suitable culture media for 
this purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a
mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified 
from the culture medium or ascites fluid by conventional immunoglobulin purification 
procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography,
gel electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such
as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of the invention can be readily isolated and sequenced using conventional
procedures (e.g., by using oligonucleotide probes that are capable of binding specifically
to genes encoding the heavy and light chains of murine antibodies). The hybridoma cells
of the invention serve as a preferred source of such DNA. Once isolated, the DNA can
be placed into expression vectors, which are then transfected into host cells such as
simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not
otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal
antibodies in the recombinant host cells. The DNA also can be modified, for example, by
substituting the coding sequence for human heavy and light chain constant domains in
place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature
368, 812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all
or part of the coding sequence for a non-immunoglobulin polypeptide. Such a
non-immunoglobulin polypeptide can be substituted for the constant domains of an antibody
of the invention, or can be substituted for the variable domains of one antigen-combining
site of an antibody of the invention to create a chimeric bivalent antibody.

4.13.3 HUMANIZED ANTIBODIES

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally
comprised of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a non-human immunoglobulin. Humanization can be performed following
the method of Winter and co-workers (Jones et al., Nature, 321:522-525 (1986);
Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536
(1988)), by substituting rodent CDRs or CDR sequences for the corresponding sequences
of a human antibody. (See also U.S. Patent No. 5,225,539). In some instances, Fv
framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues that are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general,
the humanized antibody will comprise substantially all of at least one, and typically two,
variable domains, in which all or substantially all of the CDR regions correspond to those
of a non-human immunoglobulin and all or substantially all of the framework regions are
those of a human immunoglobulin consensus sequence. The humanized antibody
optimally also will comprise at least a portion of an immunoglobulin constant region
(Fc), typically that of a human immunoglobulin (Jones et al., 1986; Riechmann et al.,
1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)).

4.13.4 HUMAN ANTIBODIES 

Fully human antibodies relate to antibody molecules in which essentially the
entire sequences of both the light chain and the heavy chain, including the CDRs, arise
from human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies" herein. Human monoclonal antibodies can be prepared by the trioma
technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol
Today 4: 72) and the EBV hybridoma technique to produce human monoclonal
antibodies (see Cole, et al., 1985 In: MONOCLONAL ANTIBODIES AND CANCER THERAPY,
Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the
practice of the present invention and may be produced by using human hybridomas (see
Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by transforming human
B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: MONOCLONAL
ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381
(1991); Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be
made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in
which the endogenous immunoglobulin genes have been partially or completely
inactivated. Upon challenge, human antibody production is observed, which closely
resembles that seen in humans in all respects, including gene rearrangement, assembly,
and antibody repertoire. This approach is described, for example, in U.S. Patent Nos.
5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368 856-859 (1994));
Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51
(1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar
(Intern. Rev. Immunol. 13 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman
animals that are modified so as to produce fully human antibodies rather than the 
animal's endogenous antibodies in response to challenge by an antigen. (See PCT 
publication WO94/02602). The endogenous genes encoding the heavy and light 
immunoglobulin chains in the nonhuman host have been incapacitated, and active loci
encoding human heavy and light chain immunoglobulins are inserted into the host's
genome. The human genes are incorporated, for example, using yeast artificial
chromosomes containing the requisite human DNA segments. An animal which provides
all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the
Xenomouse™ as disclosed in PCT publications WO 96/33735 and WO 96/34096. This
animal produces B cells that secrete fully human immunoglobulins. The antibodies can
be obtained directly from the animal after immunization with an immunogen of interest,
as, for example, a preparation of a polyclonal antibody, or alternatively from
immortalized B cells derived from the animal, such as hybridomas producing monoclonal
antibodies. Additionally, the genes encoding the immunoglobulins with human variable
regions can be recovered and expressed to obtain the antibodies directly, or can be further
modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment
genes from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent rearrangement of the locus and to prevent formation of a transcript of a
rearranged immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector containing a gene encoding a selectable marker, and producing from the
embryonic stem cell a transgenic mouse whose somatic and germ cells contain the gene
encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a
light chain into another mammalian host cell, and fusing the two cells to form a hybrid
cell. The hybrid cell expresses an antibody containing the heavy chain and the light
chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody
that binds immunospecifically to the relevant epitope with high affinity, are disclosed in
PCT publication WO 99/53049.

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab
expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid
and effective identification of monoclonal Fab fragments with the desired specificity for a
protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the idiotypes to a protein antigen may be produced by techniques known in the
art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an
F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule
with papain and a reducing agent and (iv) Fv fragments.

4.13.6 BISPECIFIC ANTIBODIES

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies
that have binding specificities for at least two different antigens. In the present case, one
of the binding specificities is for an antigenic protein of the invention. The second
binding target is any other antigen, and advantageously is a cell-surface protein or
receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the
random assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas) produce a potential mixture of ten different antibody molecules, of which
only one has the correct bispecific structure. The purification of the correct molecule is
usually accomplished by affinity chromatography steps. Similar procedures are disclosed
in WO 93/08829, published 13 May 1993, and in Traunecker et al., 1991 EMBO J.,
10:3655-3659.
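
The figure of ten different antibody molecules can be reproduced by a short enumeration, sketched below in Python. The sketch assumes that each antibody consists of two half-molecules, each being one heavy chain paired with one light chain, and that the two halves are interchangeable. The chain labels H1, H2, L1 and L2 are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Each half-molecule is one heavy chain paired with one light chain.
half_molecules = list(product(heavy_chains, light_chains))  # 4 possibilities

# A full antibody is an unordered pair of half-molecules.
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific product pairs each heavy chain with its cognate light chain.
correct = [ab for ab in antibodies
           if set(ab) == {("H1", "L1"), ("H2", "L2")}]
```

Four possible half-molecules taken two at a time with repetition give ten combinations, of which exactly one pairs each heavy chain with its cognate light chain.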

Antibody variable domains with the desired binding specificities
(antibody-antigen combining sites) can be fused to immunoglobulin constant domain sequences.
The fusion preferably is with an immunoglobulin heavy-chain constant domain,
comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to have the
first heavy-chain constant region (CH1) containing the site necessary for light-chain
binding present in at least one of the fusions. DNAs encoding the immunoglobulin
heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into
separate expression vectors, and are co-transfected into a suitable host organism. For
further details of generating bispecific antibodies see, for example, Suresh et al., Methods
in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between
a pair of antibody molecules can be engineered to maximize the percentage of
heterodimers that are recovered from recombinant cell culture. The preferred interface
comprises at least a part of the CH3 region of an antibody constant domain. In this
method, one or more small amino acid side chains from the interface of the first antibody
molecule are replaced with larger side chains (e.g. tyrosine or tryptophan).
Compensatory "cavities" of identical or similar size to the large side chain(s) are created
on the interface of the second antibody molecule by replacing large amino acid side
chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as
homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody fragments have been described in the literature. For example,
bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science
229:81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved
to generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol
complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation. The Fab' fragments generated are then converted to
thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB derivatives is then
reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is mixed with an
equimolar amount of the other Fab'-TNB derivative to form the bispecific antibody. The
bispecific antibodies produced can be used as agents for the selective immobilization of
enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and 
chemically coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 
175:217-225 (1992) describe the production of a fully humanized bispecific antibody 
F(ab')2 molecule. Each Fab' fragment was separately secreted from E. coli and subjected
to directed chemical coupling in vitro to form the bispecific antibody. The bispecific
antibody thus formed was able to bind to cells overexpressing the ErbB2 receptor and 
normal human T cells, as well as trigger the lytic activity of human cytotoxic 
lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments
directly from recombinant cell culture have also been described. For example, bispecific
antibodies have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5):1547-1553 (1992). The leucine zipper peptides from the Fos and Jun proteins
were linked to the Fab' portions of two different antibodies by gene fusion. The antibody
homodimers were reduced at the hinge region to form monomers and then re-oxidized to
form the antibody heterodimers. This method can also be utilized for the production of
antibody homodimers. The "diabody" technology described by Hollinger et al., Proc.
Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an alternative mechanism for
making bispecific antibody fragments. The fragments comprise a heavy-chain variable
domain (VH) connected to a light-chain variable domain (VL) by a linker which is too
short to allow pairing between the two domains on the same chain. Accordingly, the VH
and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another
strategy for making bispecific antibody fragments by the use of single-chain Fv (sFv)
dimers has also been reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example,
trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic
arm of an immunoglobulin molecule can be combined with an arm which binds to a
triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the
particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to
cells which express a particular antigen. These antibodies possess an antigen-binding
arm and an arm which binds a cytotoxic agent or a radionuclide chelator, such as
EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest binds the
protein antigen described herein and further binds tissue factor (TF).

4.13.7 HETEROCONJUGATE ANTIBODIES 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted
cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360;
WO 92/200373; EP 03089). It is contemplated that the antibodies can be prepared in
vitro using known methods in synthetic protein chemistry, including those involving
crosslinking agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction or by forming a thioether bond. Examples of suitable reagents for this
purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed,
for example, in U.S. Patent No. 4,676,980.

4.13.8 EFFECTOR FUNCTION ENGINEERING

WO 02/081731 PCT/US02/01222



It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC).
See Caron et al., J. Exp. Med., 176: 1191-1195 (1992) and Shopes, J. Immunol., 148:
2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor activity can also be
prepared using heterobifunctional cross-linkers as described in Wolff et al., Cancer
Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that has
dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

4.13.9 IMMUNOCONJUGATES

The invention also pertains to immunoconjugates comprising an antibody
conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments
thereof), or a radioactive isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be
used include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain,
modeccin A chain, alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins (PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin,
crotin, Saponaria officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the trichothecenes. A variety of radionuclides are available for the
production of radioconjugated antibodies. Examples include 212Bi, 131I, 131In, 90Y, and
186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol)
propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-
ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), and bis-active
fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a ricin
immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the
antibody. See WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate
is administered to the patient, followed by removal of unbound conjugate from the
circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that
is in turn conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such as
floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable media can be used to
create a manufacture comprising computer readable medium having recorded thereon a
nucleotide sequence of the present invention. As used herein, "recorded" refers to a
process for storing information on computer readable medium. A skilled artisan can
readily adopt any of the presently known methods for recording information on computer
readable medium to generate manufactures comprising the nucleotide sequence
information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can be represented
in a word processing text file, formatted in commercially-available software such as
WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a
database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any number of data processor structuring formats (e.g. text file or database)
in order to obtain computer readable medium having recorded thereon the nucleotide
sequence information of the present invention.
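
As an illustration only, the two structuring formats mentioned above (an ASCII text
file and a database) might be sketched as follows. The FASTA-style layout, the use of
SQLite as a stand-in for a database application, the table and function names, and the
placeholder sequence are all assumptions made for this sketch; none of them is
prescribed by the specification.

```python
import sqlite3
import tempfile
from pathlib import Path


def write_fasta(path, seq_id, sequence, width=60):
    """Record one sequence as a plain ASCII (FASTA-style) text file."""
    lines = [f">{seq_id}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def read_fasta(path):
    """Read back the single record written by write_fasta."""
    header, *body = Path(path).read_text(encoding="ascii").splitlines()
    return header[1:], "".join(body)


def store_in_database(conn, seq_id, sequence):
    """Record the same sequence as a row in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", (seq_id, sequence)
    )
    conn.commit()


def fetch_from_database(conn, seq_id):
    """Return the stored residues for seq_id, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # Made-up placeholder sequence, not one of the SEQ ID NOs.
    example = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    with tempfile.TemporaryDirectory() as tmp:
        fasta_path = Path(tmp) / "example.fasta"
        write_fasta(fasta_path, "EXAMPLE_1", example)
        print(read_fasta(fasta_path))
    conn = sqlite3.connect(":memory:")
    store_in_database(conn, "EXAMPLE_1", example)
    print(fetch_from_database(conn, "EXAMPLE_1"))
```

Either representation leaves the residues recoverable unchanged; which one is
preferable depends, as stated above, on the means chosen to access the stored
information.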

By providing any of the nucleotide sequences SEQ ID NOs: 1 - 438 or a
representative fragment thereof; or a nucleotide sequence at least 95% identical to any of
the nucleotide sequences of SEQ ID NOs: 1 - 438 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may
be useful in producing commercially important proteins such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.
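
To make the notion of an ORF concrete, the following is a minimal sketch of a naive
start-to-stop codon scan over all six reading frames. It is not the BLAST or BLAZE
approach referred to above, which identifies ORFs by similarity searching; the function
names, the ATG-only start rule, and the minimum-length parameter are assumptions of this
sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan both strands in three frames for ATG...stop stretches.

    Returns (strand, start, end, orf) tuples. Coordinates are 0-based,
    end-exclusive, and refer to the strand that was scanned. min_codons
    counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume looking for the next start
    return orfs


if __name__ == "__main__":
    # Made-up placeholder sequence, not one of the SEQ ID NOs.
    for orf in find_orfs("CCATGAAATTTTAGCC"):
        print(orf)
```

A scan of this kind only proposes candidate ORFs; whether a candidate encodes a protein
would still have to be assessed by other means, such as the similarity searches named
above.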

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing unit
(CPU), input means, output means, and data storage means. A skilled artisan can readily
appreciate that any one of the currently available computer-based systems is suitable for
use in the present invention. As stated above, the computer-based systems of the present
invention comprise a data storage means having stored therein a nucleotide sequence of
the present invention and the necessary hardware means and software means for
supporting and implementing a search means. As used herein, "data storage means"
refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of a known sequence which
match a particular target sequence or target motif. A variety of known algorithms are
disclosed publicly, and a variety of commercially available software for conducting search
means is available and can be used in the computer-based systems of the present
invention. Examples of such software include, but are not limited to, Smith-Waterman,
MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily
recognize that any one of the available algorithms or implementing software packages for
conducting homology searches can be adapted for use in the present computer-based
systems. As used herein, a "target sequence" can be any nucleic acid or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will
be present as a random occurrence in the database. The most preferred sequence length
of a target sequence is from about 10 to 300 amino acids, more preferably from about 30
to 100 nucleotide residues. However, it is well recognized that searches for
commercially important fragments, such as sequence fragments involved in gene
expression and protein processing, may be of shorter length.
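
The relationship between target length and chance matches can be sketched as follows.
This is a simplified illustration, not one of the search programs named above: it uses
exact string matching, and the estimate assumes independent, equally frequent residues.
The function names and the toy database are hypothetical.

```python
def find_matches(target, database):
    """Return (seq_id, position) for every exact occurrence of target.

    database maps a sequence identifier to its residue string.
    Overlapping occurrences are reported.
    """
    hits = []
    for seq_id, seq in database.items():
        pos = seq.find(target)
        while pos != -1:
            hits.append((seq_id, pos))
            pos = seq.find(target, pos + 1)
    return hits


def expected_random_hits(target_len, database_len, alphabet_size=4):
    """Expected number of chance exact matches of a target.

    Assumes every residue is drawn independently and uniformly from an
    alphabet of alphabet_size symbols (4 for nucleotides, 20 for amino
    acids), which real sequences only approximate.
    """
    positions = max(database_len - target_len + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_len


if __name__ == "__main__":
    toy_db = {"a": "ACGTACGT", "b": "TTTT"}  # made-up sequences
    print(find_matches("ACG", toy_db))
    for n in (6, 12, 30):
        print(n, expected_random_hits(n, 1_000_000))
```

Under these assumptions each additional nucleotide in the target reduces the expected
number of chance matches fourfold, which is the reason a longer target sequence is less
likely to occur at random in the database.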

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s) are
chosen based on a three-dimensional configuration which is formed upon the folding of
the target motif. There are a variety of target motifs known in the art. Protein target
motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic
acid target motifs include, but are not limited to, promoter sequences, hairpin structures
and inducible expression elements (protein binding sequences).

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be
used to control gene expression through triple helix formation or antisense DNA or RNA,
both of which methods are based on the binding of a polynucleotide sequence to DNA or
RNA. Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in
length and are designed to be complementary to a region of the gene involved in
transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA
itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as
Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple
helix-formation optimally results in a shut-off of RNA transcription from DNA, while
antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide.
Both techniques have been demonstrated to be effective in model systems. Information
contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
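
As a rough illustration of how sequence information feeds into antisense design, the
sketch below derives a DNA oligonucleotide complementary to a chosen stretch of an mRNA
and enforces the 20 to 40 base length range stated above. The function name, the choice
of a DNA rather than RNA oligonucleotide, and the placeholder mRNA are assumptions of
this sketch; practical design would also weigh target accessibility and specificity,
which are not modeled here.

```python
# Base pairing partner, written as DNA, for each mRNA (or cDNA) base.
DNA_PARTNER = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    region = mrna.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Reverse the region and pair each base so the result reads 5'->3'.
    return "".join(DNA_PARTNER[base] for base in reversed(region))


if __name__ == "__main__":
    # Made-up placeholder mRNA, not one of the SEQ ID NOs.
    mrna = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"
    print(antisense_oligo(mrna, 0, 20))
```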

4.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or 
expression of one of the ORFs of the present invention, or homolog thereof, in a test 
sample, using a nucleic acid probe or antibodies of the present invention, optionally 
conjugated or otherwise associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polynucleotide of the invention is detected in the sample.
Such methods can also comprise contacting a sample under stringent hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the invention
under such conditions, and amplifying annealed polynucleotides, so that if a
polynucleotide is amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying 
for binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample
vary. Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the nucleic acid probe or antibody used in
the assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted to
employ the nucleic acid probes or antibodies of the present invention. Examples of such
assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock,
G.R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1
(1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Immunoassays:
Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985). The test samples of the present
invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described
method will vary based on the assay format, nature of the detection method and the
tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein
extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain
the necessary reagents to carry out the assays of the present invention. Specifically, the
invention provides a compartment kit to receive, in close confinement, one or more
containers which comprises: (a) a first container comprising one of the probes or
antibodies of the present invention; and (b) one or more other containers comprising one
or more of the following: wash reagents and reagents capable of detecting the presence
of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers or
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not
cross-contaminated, and the agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such containers will include a
container which will accept the test sample, a container which contains the antibodies
used in the assay, containers which contain wash reagents (such as phosphate buffered
saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the
bound antibody or probe. Types of detection reagents include labeled nucleic acid probes,
labeled secondary antibodies, or in the alternative, if the primary antibody is labeled, the
enzymatic or antibody binding reagents which are capable of reacting with the labeled
antibody. One skilled in the art will readily recognize that the disclosed probes and
antibodies of the present invention can be readily incorporated into one of the established
kit formats which are well known in the art.

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in
medical imaging of sites expressing the molecules of the invention (e.g., where the
polypeptide of the invention is involved in the immune response, for imaging sites of
inflammation or infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such
methods involve chemical attachment of a labeling or imaging agent, administration of
the labeled polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging
the labeled polypeptide in vivo at the target site.

4.18 SCREENING ASSAYS

Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set
forth in SEQ ID NOs: 1 - 438, or bind to a specific domain of the polypeptide encoded by
the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide of the invention for a time sufficient to form a polynucleotide/compound
complex, and detecting the complex, so that if a polynucleotide/compound complex is
detected, a compound that binds to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind
to a polypeptide of the invention can comprise contacting a compound with a polypeptide
of the invention for a time sufficient to form a polypeptide/compound complex, and
detecting the complex, so that if a polypeptide/compound complex is detected, a
compound that binds to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention
can also comprise contacting a compound with a polypeptide of the invention in a cell for
a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell, and detecting the complex by
detecting reporter gene sequence expression, so that if a polypeptide/compound complex
is detected, a compound that binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate
the activity of a polypeptide of the invention (that is, increase or decrease its activity,
relative to activity observed in the absence of the compound). Alternatively, compounds
identified via such methods can include compounds which modulate the expression of a
polynucleotide of the invention (that is, increase or decrease expression relative to
expression levels observed in the absence of the compound). Compounds, such as
compounds identified via the methods of the invention, can be tested using standard
assays well known to those of skill in the art for their ability to modulate
activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be
selected and screened at random or rationally selected or designed using protein modeling
techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents and the like are selected at random and are assayed for their ability to bind to the
protein encoded by the ORF of the present invention. Alternatively, agents may be
rationally selected or designed. As used herein, an agent is said to be "rationally selected
or designed" when the agent is chosen based on the configuration of the particular
protein. For example, one skilled in the art can readily adapt currently available
procedures to generate peptides, pharmaceutical agents and the like, capable of binding to
a specific peptide sequence, in order to generate rationally designed antipeptide peptides
(for example, see Hruby et al., "Application of Synthetic Peptides: Antisense Peptides," in
Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as
broadly described, can be used to control gene expression through binding to one of the
ORFs or EMFs of the present invention. As described above, such agents can be
randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a
skilled artisan to design sequence specific or element specific agents, modulating the
expression of either a single ORF or multiple ORFs which rely on the same EMF for
expression control. One class of DNA binding agents are agents which contain base
residues which hybridize or form a triple helix formation by binding to DNA or RNA.
Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or
can be a variety of sulfhydryl or polymeric derivatives which have base attachment
capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple
helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences
of the present invention is necessary for the design of an antisense or triple helix
oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present
invention can be used as a diagnostic agent. Agents which bind to a protein encoded by
one of the ORFs of the present invention can be formulated using known techniques to
generate a pharmaceutical composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific
nucleic acid hybridization probes capable of hybridizing with naturally occurring
nucleotide sequences. The hybridization probes of the subject invention may be derived
from any of the nucleotide sequences SEQ ID NOs: 1 - 438. Because the corresponding
gene is only expressed in a limited number of tissues, a hybridization probe derived from
any of the nucleotide sequences SEQ ID NOs: 1 - 438 can be used as an indicator of
the presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in
situ hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188
provides additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used in PCR may be of recombinant origin, may be chemically synthesized, or a
mixture of both. The probe will comprise a discrete nucleotide sequence for the detection
of identical sequences or a degenerate pool of possible sequences for identification of
closely related genomic sequences.
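
The notion of a degenerate pool can be illustrated by expanding a probe written with
standard IUPAC nucleotide ambiguity codes into every discrete sequence it represents.
This is a sketch for illustration only; the function name and the example probes are
hypothetical and are not taken from the sequences of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence encoded by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    pool = expand_degenerate("GGNCAY")  # made-up degenerate probe
    print(len(pool), pool)
```

The size of the pool is the product of the number of alternatives at each position, so
each additional ambiguous position multiplies the number of discrete oligonucleotides
that must be represented in the probe mixture.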

Other means for producing specific hybridization probes for nucleic acids include
the cloning of nucleic acid sequences into vectors for the production of mRNA probes.
Such vectors are known in the art and are commercially available and may be used to
synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerase, such as T7 or SP6 RNA polymerase, and the appropriate radioactively labeled
nucleotides. The nucleotide sequences may be used to construct hybridization probes for
mapping their respective genomic sequences. The nucleotide sequence provided herein
may be mapped to a chromosome or specific regions of a chromosome using well known
genetic and/or chromosomal mapping techniques. These techniques include in situ
hybridization, linkage analysis against known chromosomal markers, hybridization
screening with libraries or flow-sorted chromosomal preparations specific to known
chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988)
Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical
chromosomal map and a specific disease (or predisposition to a specific disease) may
help delimit the region of DNA associated with that genetic disease. The nucleotide
sequences of the subject invention may be used to detect differences in gene sequences
between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly
practiced using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to
those of skill in the art using any suitable support such as glass, polystyrene or Teflon. One
strategy is to precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin.
Microbiol. 28(6) 1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987;
Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent binding of base
modified DNA (Keller et al., 1988; 1989); all references being specifically incorporated
herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8)
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from various sources,
such as, e.g., Operon Technologies (Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be
used. Nunc Laboratories have developed a method by which DNA can be covalently bound
to the microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface
grafted with secondary amino groups (>NH) that serve as bridge-heads for further covalent
coupling. CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules
may be bound to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing
immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem.
198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end
has been described (Rasmussen et al., 1991). In this technology, a phosphoramidate bond
is employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as
immobilization using only a single covalent bond is preferred. The phosphoramidate bond
joins the DNA to the CovaLink NH secondary amino groups that are positioned at the end
of spacer arms covalently grafted onto the polystyrene surface through a 2 nm long spacer
arm. To link an oligonucleotide to CovaLink NH via a phosphoramidate bond, the
oligonucleotide terminus must have a 5'-end phosphate group. It is, perhaps, even possible
for biotin to be covalently bound to CovaLink and then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl)
and denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM
1-MeIm7. The ssDNA solution is then dispensed into CovaLink NH strips (75 µl/well)
standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are
incubated for 5 hours at 50°C. After incubation the strips are washed using, e.g.,
Nunc-Immuno Wash; first the wells are washed 3 times, then they are soaked with washing
solution for 5 min., and finally they are washed 3 times (wherein the washing solution is 0.4
N NaOH, 0.25% SDS heated to 50°C).
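
The per-well quantities implied by the linkage recipe above can be checked with a short
calculation. The following is only a sketch under stated assumptions: volumes are taken to
be additive, the 7.5 ng/µl figure is taken to be the DNA concentration before the
1-methylimidazole buffer is added, and the function and parameter names are hypothetical.

```python
def covalink_well_amounts(dna_ng_per_ul=7.5, buffer_stock_m=0.1, buffer_final_m=0.010,
                          dispensed_ul=75.0, edc_stock_m=0.2, edc_ul=25.0):
    """Estimate per-well amounts for the CovaLink NH coupling step.

    Assumptions (not stated in the text): volumes are additive, and the DNA
    concentration is quoted before the buffer stock is mixed in.
    """
    # Fraction of the mixed solution that is buffer stock (0.1 M -> 10 mM is 1 part in 10).
    buffer_fraction = buffer_final_m / buffer_stock_m
    # DNA is diluted by the added buffer.
    dna_after_buffer = dna_ng_per_ul * (1.0 - buffer_fraction)
    # Each well receives the ssDNA solution plus the fresh EDC solution.
    total_ul = dispensed_ul + edc_ul
    return {
        "dna_ng_per_well": dna_after_buffer * dispensed_ul,
        "reaction_volume_ul": total_ul,
        "edc_final_m": edc_stock_m * edc_ul / total_ul,
    }


if __name__ == "__main__":
    for name, value in covalink_well_amounts().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions each well holds roughly half a microgram of DNA in a 100 µl
reaction, with the EDC diluted fourfold from its stock concentration.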

It is contemplated that a further suitable method for use with the present invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated
herein by reference. This method of preparing an oligonucleotide bound to a support
involves attaching a nucleoside 3'-reagent through the phosphate group by a covalent
phosphodiester link to aliphatic hydroxyl groups carried by the support. The
oligonucleotide is then synthesized on the supported nucleoside and protecting groups
removed from the synthetic oligonucleotide chain under standard conditions that do not
cleave the oligonucleotide from the support. Suitable reagents include nucleoside
phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe arrays may be employed.
For example, addressable laser-activated photodeprotection
may be employed in the chemical synthesis of oligonucleotides directly on a glass surface,
as described by Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by
reference. Probes may also be immobilized on nylon supports as described by Van Ness et
al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to Teflon using the method of
Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all references being specifically
incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al.
(1991), requires activation of the nylon surface via alkylation and selective activation of the
5'-amine of oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6,
incorporated herein by reference. These authors used current photolithographic techniques
to generate arrays of immobilized oligonucleotide probes (DNA chips). These methods, in
which light is used to direct the synthesis of oligonucleotide probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside
phosphoramidites, surface linker chemistry and versatile combinatorial synthesis strategies.
A matrix of 256 spatially defined oligonucleotide probes may be generated in this manner.
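
The figure of 256 probes is what a combinatorial scheme yields when four positions are each
varied over the four bases (4^4 = 256). The sketch below enumerates such a set and lays it
out as a 16 x 16 grid. The choice of four variable positions, the row-major layout, and the
function name are illustrative assumptions; the text does not specify how the matrix is
composed.

```python
from itertools import product

BASES = "ACGT"


def probe_matrix(variable_positions=4):
    """Enumerate every sequence over the variable positions and arrange it in a square grid.

    With n variable positions there are 4**n sequences, which fill a grid of
    2**n rows by 2**n columns (16 x 16 for n = 4). The layout is hypothetical.
    """
    probes = ["".join(p) for p in product(BASES, repeat=variable_positions)]
    side = 2 ** variable_positions
    return [probes[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    grid = probe_matrix()
    print(len(grid), "rows x", len(grid[0]), "columns")
    print("first probe:", grid[0][0], "last probe:", grid[-1][-1])
```

In a light-directed scheme each variable position would need one coupling cycle per base,
so the number of cycles grows linearly with probe length while the number of probes grows
exponentially.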
4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example,
Sambrook et al. (1989) describe three protocols for the isolation of high molecular weight
DNA from mammalian cells (p. 9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors
and/or prepared directly from genomic DNA or cDNA by PCR or other amplification
methods. Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of
DNA samples may be prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those
of skill in the art including, for example, using restriction enzymes as described at 9.24-9.28
of Sambrook et al. (1989), shearing by ultrasound, and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990)
Nucleic Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA
samples are passed through a small French pressure cell at a variety of low to intermediate
pressures. A lever device allows controlled application of low to intermediate pressures to
the cell. The results of these studies indicate that low-pressure shearing is a useful
alternative to sonic and enzymatic DNA fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using
the two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic
Acids Res. 20(14) 3753-62. These authors described an approach for the rapid
fragmentation and fractionation of DNA into particular sizes that they contemplated to be
suitable for shotgun cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence
PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions, which alter
the specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA
fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992)
quantitatively evaluated the randomness of this fragmentation strategy, using a CviJI**
digest of pUC19 that was size fractionated by a rapid gel filtration method and directly
ligated, without end repair, to a lac Z minus M13 cloning vector. Sequence analysis of 76
clones showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and
that new sequence data is accumulated at a rate consistent with random fragmentation.
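
The site specificities described above can be illustrated by locating cut positions in a
sequence. This is a minimal sketch rather than a model of the enzyme: it treats the input
as a linear single strand (pUC19 is circular), assumes complete digestion, and the function
names are hypothetical. Pu is taken as A or G and Py as C or T.

```python
import re

# Normal CviJI specificity: PuGCPy. Lookahead is used so overlapping sites are all found.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")
# Relaxed (CviJI**) specificity: PuGCPy, PyGCPy and PuGCPu, but not PyGCPu.
RELAXED_SITE = re.compile(r"(?=(?:[AG]GC[ACGT]|[CT]GC[CT]))")


def cut_positions(seq, relaxed=False):
    """Return indices at which the sequence is cut (between the G and the C of each site)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    # A site starts one base before the G, so the G/C junction is at start + 2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every cut position (complete digestion assumed)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "TTAGCTAACGCCAATGCGTT"
    print("normal :", fragments(example))
    print("relaxed:", fragments(example, relaxed=True))
```

Because the relaxed specificity accepts three of the four NGCN classes, cut sites occur
several times more often than under normal conditions, which is consistent with the
quasi-random fragment distribution reported.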

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead
of 2-5 ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or
agarose gel electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or
prepared, it is important to denature the DNA to give single stranded pieces available for
hybridization. This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C.
The solution is then cooled quickly to 2°C to prevent renaturation of the DNA fragments
before they are contacted with the chip. Phosphate groups must also be removed from
genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the positions of which
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of
a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the
density of the wells is achieved. One to 25 dots may be accommodated in 1 mm²,
depending on the type of label used. By avoiding spotting in some preselected number of
rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray
may be the same genomic segment of DNA (or the same gene) from different individuals, or
may be different, overlapped genomic clones. Each of the subarrays may represent replica
spotting of the same samples. In one example, a selected gene segment may be amplified
from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate
(all 96 wells containing the same sample). A plate for each of the 64 patients is prepared. By
using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. Subarrays
may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
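
The 64-patient example can be made concrete by computing where each dot lands on the
membrane. The sketch below assumes a 9 mm pin pitch (the usual well spacing of a 96-well
plate), an 8 x 8 arrangement of the 64 patient dots within each subarray, and a 1 mm offset
between successive prints. These layout details and all names are assumptions for
illustration; the text gives only the dot span and the spacing between subarrays.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device matching a 96-well plate
PIN_PITCH_MM = 9.0           # assumed: standard 96-well plate spacing
DOT_PITCH_MM = 1.0           # one dot per 1 mm step (offset printing)
SUBARRAY_SIDE = 8            # assumed: 64 patients arranged as 8 x 8 dots


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot printed by one pin for one patient plate.

    Each patient plate is printed with the whole pin array shifted by a small
    offset, so every pin builds up its own 64-dot subarray.
    """
    if not 0 <= patient < SUBARRAY_SIDE ** 2:
        raise ValueError("patient index out of range")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    dx = (patient % SUBARRAY_SIDE) * DOT_PITCH_MM
    dy = (patient // SUBARRAY_SIDE) * DOT_PITCH_MM
    return (pin_col * PIN_PITCH_MM + dx, pin_row * PIN_PITCH_MM + dy)


if __name__ == "__main__":
    print("first dot:", dot_position(0, 0, 0))
    print("last dot :", dot_position(63, 7, 11))
```

With these assumptions, eight dots at 1 mm steps occupy 8 mm of each 9 mm pin interval,
leaving the 1 mm gap between subarrays mentioned in the text, and the whole pattern fits
within the 8 x 12 cm membrane.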

Another approach is to use membranes or plates (available from NUNC, Naperville,
Illinois) which may be partitioned by physical spacers, e.g. a plastic grid molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by
exposure to flat phosphor-storage screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration
of the present disclosure, one of skill in the art will appreciate that many other embodiments
and variations may be made in the scope of the present invention. Accordingly, it is
intended that the broader aspects of the present invention not be limited to the disclosure of
the following examples. The present invention is not to be limited in scope by the
exemplified embodiments which are intended as illustrations of single aspects of the
invention, and compositions and methods which are functionally equivalent are within the
scope of the invention. Indeed, numerous modifications and variations in the practice of the
invention are expected to occur to those skilled in the art upon consideration of the present
preferred embodiments. Consequently, the only limitations which should be placed upon
the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby
incorporated by reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from
various human tissues and in some cases isolated from a genomic library derived from
human chromosome using standard PCR, SBH sequence signature analysis and Sanger
sequencing techniques. The inserts of the library were amplified with PCR using primers
specific for the vector sequences which flank the inserts. Clones from cDNA libraries were
spotted on nylon membrane filters and screened with oligonucleotide probes (e.g., 7-mers)
to obtain signature sequences. The clones were clustered into groups of similar or identical
sequences. Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a
typical Sanger sequencing protocol. PCR products were purified and subjected to
fluorescent dye terminator cycle sequencing. Single pass gel sequencing was done using a
377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences. In
some cases RACE (Rapid Amplification of cDNA Ends) was performed to further extend
the sequence in the 5' direction.

5.2 EXAMPLE 2 
Novel Nucleic Acids 

The novel nucleic acids of the present invention were assembled
from sequences that were obtained from a cDNA library by methods described in Example
1 above, and in some cases sequences obtained from one or more public databases. The
nucleic acids were assembled using an EST sequence as a seed. Then a recursive algorithm
was used to extend the seed EST into an extended assemblage, by pulling additional
sequences from different databases (i.e., Hyseq's database containing EST sequences,
dbEST version 119, gb pri 119, and UniGene version 119) that belong to this assemblage.
The algorithm terminated when there were no additional sequences from the above databases
that would extend the assemblage. Inclusion of component sequences into the assemblage
was based on a BLASTN hit to the extending assemblage with BLAST score greater than
300 and percent identity greater than 95%.
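
The seed-extension procedure described above can be sketched as a loop that keeps adding
qualifying hits until none remain. This is an illustration only, not the software actually
used: the `search` callback is a hypothetical stand-in for running BLASTN against the
listed databases, and the `Hit` record and function names are likewise assumptions. Only
the two inclusion thresholds come from the text.

```python
from typing import Callable, FrozenSet, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    seq_id: str
    score: float             # BLAST score of the hit against the assemblage
    percent_identity: float


MIN_SCORE = 300.0      # from the text: score greater than 300
MIN_IDENTITY = 95.0    # from the text: percent identity greater than 95%


def extend_assemblage(seed_id: str,
                      search: Callable[[FrozenSet[str]], Iterable[Hit]]) -> Set[str]:
    """Grow an assemblage from a seed EST until no new sequence qualifies.

    `search` is a hypothetical callback returning hits for the current members;
    in practice it would wrap a BLASTN search of the sequence databases.
    """
    members = {seed_id}
    while True:
        qualifying = {h.seq_id for h in search(frozenset(members))
                      if h.score > MIN_SCORE and h.percent_identity > MIN_IDENTITY}
        new_members = qualifying - members
        if not new_members:          # termination: nothing extends the assemblage
            return members
        members |= new_members


if __name__ == "__main__":
    # Toy hit table standing in for database searches (invented identifiers).
    toy_hits = {
        "seed": [Hit("est1", 450, 98.0), Hit("est2", 310, 94.0)],
        "est1": [Hit("est3", 301, 95.5), Hit("seed", 450, 98.0)],
        "est3": [Hit("est4", 299, 99.0)],
    }

    def toy_search(members):
        return [h for m in members for h in toy_hits.get(m, [])]

    print(sorted(extend_assemblage("seed", toy_search)))
```

The loop is written iteratively rather than recursively, which gives the same fixed point
while avoiding deep call stacks for large assemblages.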

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any
frame shifts and incorrect stop codons were corrected by hand editing. During editing, the
sequence was checked using FASTY and/or BLAST against Genbank (i.e., dbEST version
120, gb pri 120, UniGene version 120, Genpept release 120). Other computer programs
which may have been used in the editing process were phredPhrap and Consed (University
of Washington) and ed-ready, ed-ext and cg-zip-2 (Hyseq, Inc.). The full-length nucleotide
and amino acid sequences, including splice variants resulting from these procedures, are
shown in the Sequence Listing as SEQ ID NOS: 1-438.

Table 1 shows the various tissue sources of SEQ ID NO: 1-438.

The nearest neighbor results for polypeptides encoded by SEQ ID NO: 1-438
were obtained by a BLASTP (version 2.0al 19MP-WashU) search against Genpept,
Geneseq and SwissProt databases using BLAST algorithm. The nearest neighbor result
showed the closest homologue with functional annotation for SEQ ID NO: 1-438. The
translated amino acid sequences which the nucleic acid sequences encode are shown
in the Sequence Listing. The homologues with identifiable functions for SEQ ID NO: 1-438
are shown in Table 2 below.

Using eMatrix software package (Stanford University,
Stanford, CA) (Wu et al., J. Comp. Biol., Vol. 6 pp. 219-235 (1999) herein incorporated
by reference), all the sequences were examined to determine whether they had
identifiable signature regions. Table 3 shows the signature region found in the indicated
polypeptide sequences, the description of the signature, the eMatrix p-value(s) and the
position(s) of the signature within the polypeptide sequence.

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) polypeptides encoded by
SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438) were examined for domains with homology
to certain peptide domains. Table 4 shows the name of the domain found, the
description, the product of all the e-values of similar domains found, the Pfam score for
the identified domain within the sequence, number of similar domains found, and the
position of the domain in the SEQ ID NO: being interrogated.

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San
Diego, CA) was used to predict the three-dimensional structure models for the
polypeptides encoded by SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438). Models were
generated by (1) PSI-BLAST, which is a multiple alignment sequence profile-based
searching method developed by Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2)
High Throughput Modeling (HTM) (Molecular Simulations Inc. (MSI), San Diego, CA),
which is an automated sequence and structure searching procedure
(http://www.msi.com/), and (3) SeqFold™, which is a fold recognition method described
by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)). This analysis was carried
out, in part, by comparing the polypeptides of the invention with the known NMR
(nuclear magnetic resonance) and x-ray crystal three-dimensional structures as templates.

Table 5 shows "PDB ID", the Protein DataBase (PDB) identifier given to template
structure; "Chain ID", identifier of the subcomponent of the PDB template structure;
"Compound Information", information of the PDB template structure and/or its
subcomponents; "PDB Function Annotation", which gives function of the PDB template as
annotated by the PDB files (http://www.rcsb.org/PDB/); start and end amino acid position
of the protein sequence aligned; PSI-BLAST score, the verify score, the SeqFold score,
and the Potential(s) of Mean Force (PMF). The verify score, produced by GeneAtlas™
software (MSI), is based on the Profile-3D threading program developed in
Dr. David Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and
Eisenberg, Nature, 356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc.
Natl. Acad. Sci. USA, 95:13597-13602. The verify score produced by GeneAtlas
normalizes the verify score for proteins with different lengths so that a unified cutoff can
be used to select good models as follows:

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
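
The normalization above can be written as a small function to show how raw scores map onto
the normalized scale. This is a sketch only: the text does not define how the "high score"
is obtained for a given protein, so here it is simply passed in as a positive number, and
the function name is hypothetical.

```python
def normalized_verify_score(raw_score: float, high_score: float) -> float:
    """Apply (raw - high/2) / (high/2) as given in the text.

    `high_score` is assumed to be a positive, length-dependent reference value
    supplied by the caller; its derivation is not described in the text.
    """
    if high_score <= 0:
        raise ValueError("high_score must be positive")
    half_high = high_score / 2.0
    return (raw_score - half_high) / half_high


if __name__ == "__main__":
    for raw in (50.0, 75.0, 100.0):
        print(raw, "->", normalized_verify_score(raw, high_score=100.0))
```

A raw score equal to half the high score maps to 0 and a raw score equal to the high score
maps to 1, so the normalized value is comparable across proteins of different lengths.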

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence identity in the
alignment used to build the model, pairwise and surface mean force potentials (MFP). As
given in Table 5, a verify score between 0 to 1.0, with 1 being the best, represents a good
model. Similarly, a PMF score between 0 to 1.0, with 1 being the best, represents a good
model. A SeqFold™ score of more than 50 is considered significant. A good model may
also be determined by one of skill in the art based on all the information in Table 5 taken in
totality.
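
The three cutoffs stated above can be applied to a row of Table 5 as independent checks.
The sketch below reports each criterion separately rather than combining them, since the
text leaves the overall judgment to one of skill in the art. The record type and field
names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelScores:
    verify: float    # normalized verify score
    pmf: float       # Potential(s) of Mean Force score
    seqfold: float   # SeqFold score


def score_flags(scores: ModelScores) -> dict:
    """Report which of the stated cutoffs a model satisfies."""
    return {
        "verify_in_good_range": 0.0 <= scores.verify <= 1.0,
        "pmf_in_good_range": 0.0 <= scores.pmf <= 1.0,
        "seqfold_significant": scores.seqfold > 50.0,
    }


if __name__ == "__main__":
    print(score_flags(ModelScores(verify=0.8, pmf=0.3, seqfold=72.5)))
```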

The nucleotide sequence within the sequences that codes for signal peptide
sequences and their cleavage sites can be determined using the Neural Network SignalP
V1.1 program (from Center for Biological Sequence Analysis, The Technical University
of Denmark). The process for identifying prokaryotic and eukaryotic signal peptides and
their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren
Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and
eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering,
Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and
a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 6 shows the position of the signal peptide in each of the
polypeptides and the maximum score and mean score associated with that signal peptide.

Table 7 correlates each of SEQ ID NO: 1-438 to a specific chromosomal location.

Table 8 is a correlation table of the novel polynucleotide sequences SEQ ID NO:
1-438, novel polypeptide sequences SEQ ID NO: 1-438, and their corresponding priority
nucleotide sequences in the priority application USSN 09/774,528, herein incorporated
by reference in its entirety.






WO 02/081731 PCT/US02/01222 



Table 1 



Tissue Origin 



RNA/Tissue
Source 



Library 
Name 



SEQ ID NO: 



adult brain 



GIBCO 



AB3001 



76-77 91 106-107 115 134 163-164 178 203 
232 255 265 276 279 322-323 



adult brain 



GIBCO 



ABD003 



16 19 24 77 
110 116 121 
142-143 151 
193 196 198 
220 223 229 
259 262 265 
317 321 324 
371 391-392 



80-81 85 89 
-123 125 130 
153 158-159 
200 208-209 
232-234 236 
267 274-276 
-325 327 337 
400 



-90 92 96 98 105 
-132 134-136 138 
163-164 184 191 
213-214 216 219- 
239 241 243 257- 
278 284 292 302 
-338 340 348 359 



adult brain 



Clontech



ABR001 



1 18-19 35 80 98 125 136 153 185 200 209 
221 228-229 239 243 274-275 302 399-400 



adult brain 



Clontech 



ABR0065 



7-8 18 32 35 52 57 85 91 96 111 113 126 

131 135 138-139 142 148 153-154 181 188 

192 199 209-211 217 221 224 226 229 233 

235 238 243 248 273 283-284 286 292 316 

322 348 357 361 367 376 378 399 407 409 

417 428 



adult brain 



Clontech 



ABR008 



2 4 6-11 19- 
72-73 76 80- 
109 111-112 
135 138-139 
159 168-172 
189-190 194 
219 221-222 
243-244 248 
276 281-282 
304 315-317 
332 341 352- 
376-377 379- 
394 396-402 
433 



21 23-25 31 
81 85 88-90 
114-119 121 
144 146-150 
174-175 178 
196 198-201 
224 229-230 
253-256 260 
286-289 291 
319 321-322 
357 360 362 
380 383-384 
407-410 412 



35-37 39 
94-95 97 

-122 126- 
152-153 
180 182 
203 205- 
232-233 

-261 263 

-292 299 
324 326 
365 367 
387-389 

-413 419 



-41 45-46 

102-105 
131 134- 
156-157 
185-186 
210 217 
236-239 
265 273 
300 302 
329 331- 
368 370 
391-392 
425-426 



adult brain 



Clontech 



ABR011 



85 90 



adult brain 



BioChain 



ABR012 



148 213 



adult brain 



BioChain 



ABR013 



85 322 



adult brain 



Invitrogen 



ABR014 



9 23 85 146 200 233 282 321 330 



adult brain 



Invitrogen 



ABR015 



14 31 69 121 124 163 209 216 224 291 377 



adult brain 



Invitrogen 



ABR016 



92 136 219 279 



adult brain 



Invitrogen 



ABT004 



2 7-8 20-21 33 85 90 
121 123 129-131 138 
157-158 172 178 180 
230 232 234 239 308 
373 375 401 412 



-91 95 97 102-103 108 
139 143 146 151 153 
209-210 213 219 229- 
321 330 360 365 370- 



adipocytes 



Stratagene 



ADP001 



3-4 23 36 79 81 106 
147 151 154 158 179 
256-257 287 292 297 



-107 116 129 
181 192 196 
313 329 359 



133-134 
222 230 



adrenal gland 



Clontech 



ADR002 



2 25 27 
114 121- 
180 182 
244 246 
329 336 



33 57 76 85 
122 125 129 
198-199 201 
253-254 257 
352 403 



-86 88 96 98 
-130 134 147 
205 207-208 
261 276 280 



105-108 
164 178 
240-241 
292 320 



adult heart 



GIBCO 



AHR001 



3 17-21 
105-110 
139 141 
182 186 
213 215 



27 32 74 76 
117 121 124 
148 151-153 
190 193 198 
222 



85 89-91 95-96 102-103 
-125 128 131 134-136 
155-156 161 163 181- 
200-201 205 207 211- 




























225 


229- 


230 


234 


251- 


254 


257- 


-259 


263 


274- 








277 


280 


292- 


-297 


301 


303- 


-304 


315- 


-316 


319 








329 


-331 


345 


359 


384 


417 


423- 


-424 






adult kidney 


Invitrogen 


AKT002 


3 6 


14 20-21 25- 


-26 76 79 85 


89 94 101 111 








114 


118 


121 


124 


126 


130- 


-131 


138 


146 


163 








170 


177- 


178 


189 


196 


198 


201 


204 


213 


231 








253 


-254 


256- 


-259 


271 


273- 


-275 


277 


298 


315 








320 


329 


342 
















adult lung 


GIBCO 


ALG001 


4 29 74 


79 85 90 96 


105 


111 


119 


132 


134 






136 


142 


144 


149 


159 


181 


189 


198 


200 


205- 








207 


226 


255 


257 


263 


283 


294 


300 


302- 


-303 








328 


358- 


359 


365 


426 












lymph node 


Clontech


ALN001 


6 16 31 


105 


120 


215 


257 


295 


306 


309 


359


young liver


GIBCO 


ALV001 


J. u — 


11 25-26 


29 31 33 


76 


85 95 115 121-122 






1 OA 


126 


130 


143 


146 


156 


158 


164 


178 


182 








187 


189 


229 


248 


253- 


254 


261 


278 


283 


304 








342 


375 


















adult liver 


Invitrogen 


ALV002 


10-


12 23 26 


31 33-34 38 


53 56 90-92 


94-95 








ns 


121 


124 


128- 


-129 


138 


141 


146 


148 


153 








156 


161 


171 


178 


198 


216 


232 


248 


253- 


-254 








256 


-257 


264 


302 


306 


365 


375 


383 


396 




adult liver 


Clontech 


ALV003 


10- 


11 156 171 188 












Ovary


Invitrogen 


AOV001 


3-8 


10-11 14 16 


19-22 24 27- 


-31 34 36 57 73 






75- 


76 81-82 


85 89-91 94- 


-98 104-109 111 








115 


-116 


121- 


-128 


130- 


131 


134 


136 


138- 


-139 








141 


143- 


-144


146 


149- 


150 


152 


155 


157- 


-160 








163 


-166 


170- 


-173 


175 


177- 


-178 


180 


182 


184- 








187 


189- 


-190


193 


-194 


196- 


-197 


200 


-201 


212- 








213 


215 


217 


222 


225- 


226 


228 


230- 


-233 


235 








241 


-243 


245 


248 


253- 


-259


261 


266 


-267 


270 








272 


-273 


276- 


-278 


283- 


-285


287 


289 


292 


297- 








299 


305- 


-306


315 


-317 


319 


323-


-325 


329- 


-331 








341 


343- 


344 


352 


358- 


-359


363- 


-366 


382- 


-383 








386 


389- 


-390


412 














Placenta 


Invitrogen 


APL001 


73 


92 117 135 182 194 232 246 251 272 282








359 




















placenta 


Invitrogen 


APL002 


16 


28 92 121 135 144 157 178 210 394 


adult spleen 


GIBCO 


ASP001 


3-4 


16 32-33 35 


90 96 99-100 123-125 128 






131 


134 


136 


139 


151 


178 


181 


189 


194 


200 








210 


218 


229 


251 


253- 


-255


257 


276 


283 


307- 








309 


315 


329 


354 


-355 


357 


392 


400 




1 


testis 


GIBCO 


ATS001 


22 


73 82 91 


96-97 104-105 117 124 130 134 








164 


173 


200 


209 


222 


233 


241 


253 


-254 


257 








285 


287- 


288 


305 


325 


329 


351 


-353 


359 




bladder 


Invitrogen 


BLD001 


4 108 130 150 212 226 236 240 242 257 276 








287 


305 


395- 


-396 


415 












bone marrow 


Clontech 


BMD001 


1 4 


-5 22 29 


-30 


34 72 85 


88 


90 92 94 


98








104 


-107 


109 


111 


113 


117 


120 


123 


-125 


128- 








129 


132 


135 


140 


142 


144 


146 


152 


163 


165- 








166 


170- 


-173 


177 


180 


182 


186 


189 


-190 


198- 








209 


215 


222 


225 


232 


240- 


-246 


251 


-252 


260- 








261 


273- 


-275 


277 


-280 


283- 


-285 


300 


316 


318 








346 


-347 


359 
















bone marrow 


GF 


BMD002 


1 4 


7-8 


10- 


11 16 19 


25 31 49 61 


-62 


72 74 








76 


80 85 88 


90 


93-95 97- 


-101 


109 


-110


112 








114 


116- 


-117


121 


126 


129 


132 


135 


141 


144 










146 149-150 154 lb/ lou Xo2-XoJ Xo5-X0o 








X70— X/2 i/5 X/o— XoU loz-loj Xoo-xyu X^Z~ 








194 198-200 203 208 210-213 215 223 22b 








234 242 245 247 251-254 256-257 265 2/0 








273 276-278 280 285 287 289 291 293-294 








299 302 307 309 315 322 324 337-338 353 








356-357 359 367 369 388 407 414 419 426 








434 


bone marrow 


Clontech


BMD007 


144 


*Mixture of


VARIOUS 


CGd010


1 34-35 95 152 161 171 182 206 219 242 260 


16 tissues - 


VENDORS 




267 276 280 288 297 300 315-316 412 


mRNA 








*Mixture of


Various 


CGd011


45 51 167 188 216 251-252 


16 tissues - 


Vendors 






mRNA 








*Mixture of


Various 


CGd012 


2 10-11 18-21 29 31 34-35 40 42-43 45 48 


16 tissues - 


Vendors 




50-52 69-71 87-89 94-95 98-105 109 111-113 


mRNA 






117 120 123 125 127 131 135-136 138 146 








158 163 165-169 175 180 187-188 191 198 








201 208 216 219-221 224 226 234 236 238- 








239 241-246 251-252 260 264 270 276-277 








279 281 283-284 287 295-296 314 319 321 








327-328 331 333-334 337-341 343 351-352 








361 365 369 379-380 387 389 395 397-399 








402 406 410-412 417 419 424 426 431-433 


*Mixture of


Various 


CGd013 


29 48 101 146 167-169 187 219 234 327 333 


16 tissues - 


Vendors 




339 341 365 412 433 


mRNA 








*Mixture of


Various 


CGd015 


29 86 90 95 98 110 113 118 132 158 171 184 


16 tissues - 


Vendors 




193 218-220 243 284 310 385 410 419 


mRNA 








*Mixture of


Various 


CGd016 


3-4 20-21 29 38 85 88-89 95 105 119 122 


16 tissues - 


Vendors 




131-133 140 185 211-212 225 256-257 273 


mRNA 






276 302 318 379-380 390 400 419 


colon 


Invitrogen 


CLN001


4 25 33 85 138 146 148 158-159 198 210 229 








301 360 384 397 


cervix 


BioChain 


CVX001 


3 5 10-11 18 20-21 24-25 29 36 41 47 57 63 








72 74 76 86 90 94 104 108-109 111 125 127 








130 134 138 144 147 162 174 178-179 182 








186 189 193 197 211 222 225-226 228 232 








241 243 257 261 267 270 273-275 278-281 








288-289 298 301-302 305 315 319 324-325 








329 331 337-338 359 391-392 395 420


endothelial 


Stratagene


edtu U X 


1R-1Q OA 97-29 35 72 76 79-80 85 89 96 


cells 






98 104-107 111 117 119-121 124-131 134 136 






138-139 141 144 146-147 149 152 158-159 








166-167 170-173 178-179 182-183 186-187 








191 193-194 196-197 200 210-211 222-224 








226 231-232 236 241 243 246 248 253-256 








258-259 276 279 282 287 292 300 302-303 








315 329 337-338 358-362 382-383 385-388 


esophagus 


BioChain 


ESO002 


257 


fetal brain 


Clontech 


FBR001 


34 


fetal brain 


Clontech 


FBR004 


3 139 144 271 284 337-338 


fetal brain 


Clontech 


FBR006 


4 6-11 14 18-21 24 28 31 37-38 40 63 76 85 








87 89-90 94-95 97 105 108-109 112-113 115 



117-120 127 
170 172 175 
199 201 203 
232-233 240 
281 288-289 
330-331 356 
380 383 389 
419 421 423 



-130 133 
180 182 
209-210 
243 245 
292 295 

-357 359 
397 399 



138 140 
186-188 
215 219 
253-255 
304 315 
-360 364 
-401 408 



144-146 148 
190 192 194 
222 229-230 
270 273 276 
317 319 324 
367-368 379- 
-409 411 413 



fetal brain 



Invitrogen 



FBT002 



2 14 19 23 28 31 90 94 105 121 124 126 131 
135 139 142 149 158 186 193 198 210 214- 
215 232 239 242 248 255 267 326 332 365 
369 371 376-383 394 399 



fetal heart 



Invitrogen 



FHR001 



4 7-8 10-11 14 17-21 28-29 31-32 60 64-65 
73 85 87 92 95 102-103 105 108 111 113 117 
119 121 125 128-129 134-135 141 152 154 
156-157 160-161 172 176 178 194 196 198- 
200 203 208 212 215 218 222 226 229 233- 
234 253-257 261 265 272 276 281 292-293 
295 303 305 319 325 327 337-338 341 345 
349 354-355 367-368 389 395-396 398 412 
417 436 



fetal kidney 



Clontech 



FKD001 



1 14 22 94 110 115 132 134-135 146 178 189 
199 235-236 242 247 257 267 292 295 359 



fetal kidney 



Clontech 



FKD002 



22 31 38 40 46 94 122 127 131 156 160 194 
198 229 253-254 270 292 303 319 354-355 
389 396 



fetal kidney 



Invitrogen 



FKD007 



303 



fetal lung 



Clontech 



FLG001 



85 89 98-100 111 175 271 281 369 



fetal lung 



Invitrogen 



FLG003 



84 88 106-107 122 135 140 146 160 181 246 
272 284 292 328 330 396 404 416 426 



fetal liver- 
spleen 



Soares 



FLS001 



1-3 6-12 14 19 23 28-31 33 57 59-60 72-76 
78 80 83 85-138 140-141 143-144 146-155 
157-161 163-197 200 204 208 210-211 223 
225 230 232-233 235 241-243 245-266 268- 
273 277 281 285-287 292 297 303 314 329 
343 346-347 357-359 369 397 399 407 415 



fetal liver- 
spleen 



Soares 



FLS002 



1 3-4 6 10-12 23-24 29 31-33 35-37 53-54 
74-76 79 81-82 86-89 91 94-95 99-104 106- 
109 111-112 115 117-120 122 125-126 128- 
129 132 134 136-138 141 146 149 153 157- 
159 162-166 170 172 175 178-180 183 185- 
191 194 196-197 205 207-212 222-225 228 
232-233 239-241 248 251-252 255-256 258- 
259 261-262 264 266-267 270-271 273-275 
277-278 283 285 287 298 305 315 317-318 
322 330-332 337-338 341 343 349 357-360 
365 388 390-391 399 402 418 424 



fetal liver- 
spleen 



Soares 



FLS003 



12 29 91 98 111 119 156 163 165 178 186 
193 210-211 276 286 315 322 346-347 357 
365 424 



fetal liver 



Invitrogen 



FLV001 



7-8 14 35 118 122-123 129 146 182 211 230 
232 248 251-252 264 287 304 337-338 344 
346-347 352 365 367-369 



fetal liver 



Clontech 



FLV002 



102-103 147 149 300 



fetal liver 



Clontech 



FLV004 



73 85 105 108 118 122 126 141 156-157 161 
165 170 178 180 182 194 215 218 225 240 
242 247 251-252 292 330 337-338 369 407 411 440



Table 1



Tissue Origin


RNA/Tissue Source


Library Name


SEQ ID NO:


















fetal muscle 


Invitrogen 


FMS002 


5 9 17-18 20-21 29 38 85 88 97 106-107 129
131 136 150-152 155 165 170 179 182 192-193
212-213 229 234 242 258-259 270 282 286 289
300 316 319 345 351 354-355 360 389 396 408
410 437 439










fetal skin 


Invitrogen 


FSK001 


2 4 7-8 29 33 42-43 49 51-52 58 74 82 85
90 94 110-111 116 118 121 133 136 138-139
145 151 154 156-157 161-162 172 181 184 186
193 198 200 205 207 209-211 222 227-230 232
235 240 246 253-257 266 270 276 292 295 299
316 318 323 330 332 337-340 343 357 369 389
394-395 412 422 427




fetal skin 


Invitrogen 


FSK002 


4 9 42 44 51 66 72 81 85 89-90 95 98 105
112-114 119 121 129 133 135 162 172 179-182
197 200 208 210 231 243-244 272 304 316 330
339 354-355 357 360 389 395 410 417 437


















fetal spleen 


BioChain 


FSP001 


157 223


















umbilical cord


BioChain


FUC001


4-6 20-21 25 29 73-74 83 87 89-91 94 101
109 120 123 125 128 130-131 133 141 143-144
147 149 154 161 165 173 175 179 184 188
210-212 217 226 235 240 248 251-252 257 262
267 270 277 293 305 307 316 319 323 327 331
341 356 359 389 392 407 416


fetal brain 


GIBCO 


HFB001 


2-4 16 20-21 74 77 85 89-91 96-98 104-105
111 114 118 121-122 124-125 127-128 131 134
137-140 142 144 146-148 151 153 158-159
163-164 166 173 178 180 182 191 194 196 200
203 209-214 216-232 234-236 238-239 243
253-255 263 270 272-273 276 281 292 310 316
319-321 332 348 357 359 365 399




















macrophage 


Invitrogen 


HMP001 


2 247 


infant brain 


Soares 


IB2002 


2-4 7-8 19-22 26-27 31-32 35 73-74 80 85
89 91 96-98 106-107 110 112 118-119 121-122
125 128-131 134-144 148 153 164 166 172-173
177 180 186-187 191-194 196 202-203 208-210
217 219 223-224 227 229 232-234 236-237 239
241-243 245 248 253-259 273-275 278-279 282
287 294 298 309 314 317 322 327 330 333-334
341 348-350 360 368 376 379-380 382 396 406
424






infant brain 


Soares 


IB2003 


3-4 20-21 26 28 31 35 73 85 95-96 110 113
119 122-123 130-131 135 138 140 142-143 146
153 155 170 172-173 186 191-193 196 209 219
223 226 229 233-234 236 239 245 248 253-254
256-257 273 279 291-292 304 314 337-338 343
359 367 371 376 397 413


lung, fibroblast


Stratagene


LFB001


3 6 31 72-73 90 92 105-107 124 126-127 133
136 139 144 146 172 189 198 204 233 235 246
258-259 268 272 276 282 310 335 359 434




















adult lung 


Invitrogen 


LGT002 


4 19-21 28 33 35-36 49 72 79 81 85 88 90-91
94-95 101 106-107 109 118 120-125 127
130-131 133 135-138 141-142 144 147 149 157
159-161 163 166 170-173 193-194 196-197 212
216 218 221 223 226 228-229 231 233 241
247-248 253-255 257 261 266-267 270-275
277-278 282-283 292 298 301 303 315 318 324
331 335 354-355 359 367 369 381 392-393 398














leukocytes 


GIBCO 


LUC001 


1-5 15 19-21 28 30-33 37 72 74 91 94-95
97-100 108-109 113 115 117 119-122 124-125
127-128 134-138 141 144 146-148 150-151
157-158 160 162-167 170-173 175-178 180-181
187 189 192 194 197 200 212-213 215-216
218-219 223 225 228-232 241-242 245-246
251-254 261 272-276 278-282 284 287-290
297-298 305 307 310-314 325 331 336 340
358-359 372 399 414












leukocytes 


Clontech 


LUC003 


1 5 124 171 176 204 225 248 253-254 283 285
307 315
















melanoma 


Clontech 


MEL004 


4-5 24 27 72-74 81 85 106-107 113 126 177
203 205-207 209 231 243 284-285 315-316 320
326 359 374 428












mammary gland 


Invitrogen 


MMG001 


2 4-5 7-8 10-12 29 31 34-35 38 50 80-81 85
89-90 92 94-97 105 108-109 119-124 126
128-130 135 138-139 141-142 144 146-147 153
155 157-159 163 178-179 181-182 198 200
209-210 219 223 228 230 232-233 235-236 239
242 248 253-255 257 260-261 265-267 270 272
281 287 292 294 315-316 318 324 327 330
337-340 354-355 357 369 372 383 392-395 401
404












neuron 




MTTlfl ft T 
iNlJJUU J. 


35 47 8S-90 111 118 164 232 253-254 276 324
331 382
















neuron 




ivi KUU J. 


20-21 3-7 122 147-149 170 179 181 186 212
226 258-259 265 276 369 436 438






neuronal cells


o era Ley ciic


KFPTTft ft 1


7-8 37 55 80 85 112 118 126-127 133 138
140-141 151 170 181 210 214 225-226 236 243
287 328 330-331 357 383 400 436




pituitary gland




PTTftftA


92 124 159 231


























placenta 


Clontech


BT Aftft^ 


34 46 88 126 128 159 182 186 197 201 267
278 281-282 305 330 356 361 365 418




prostate 


Clontech


DDTftft 1 


18 36 72 74 86 95 106-107 111 118 122 144
161 179 211 218 233 286 297








rectum 


Invitrogen


REC001 


9 31 85 121 128 147 171 200 219 257 292 340
394 398 407 412












salivary gland


Clontech


SAL001


3 24 38 80 122 136 147 189 241 282 296 310
351 392 395 415














saliva gland 


Clontech 


SALS 03 


118 


small intestine


Clontech


SIN001


12 16 25 82-83 89-90 93 95 98 105-109 111
122-123 125-128 133-134 137 139 142 161 167
171 184 197 201 204 212 218 236 242-243
248-249 253-254 257 267 276 284-285 292 297
300 303 310 313 317-318 325 340 343 352
354-355 359 383 391 416






spinal cord 


Clontech 


SPC001 


3 39 84 86 94 96 105 115 117 130-131 134
136 141 143 148 155 176 190-191 203 213 224
233-234 236 239 279 283 298 320-321 332
336-338 356 359 365 404-406






thalamus 


Clontech 


THA002 


2 20-21 23 74 81 85 105-106 116 121 131 146
171 185 188 200 209 219 233 239 256 258-259
273 276 362 399










thymus 


Clontech


THM001 


16 29 33 57 80 82 85 90 93-94 106-107 120
126 128 134 141 161 176 194 223 228 235
253-254 261 274-275 278 285 298 319 332 336
343 353 359 425












thymus 


Clontech 


THMC02 


1-2 7-9 14 26 34 44 73 75 82 85 87 94 98
106-107 109-111 117 119-120 125-126 128-129
139 141 144 147-148 151 154-155 162 165
170-172 175-176 179 182 186 193-194 199-200
208-209 213 218 233 235 240 242 247 253-254
257 265 276 281 287 290 305 307 312 319 336
342 354-356 359 364 367 399 408 412-413 415
419 421 426 429-433


thyroid gland 


Clontech 


THR001


3 5 7-8 28 30-31 33 73-77 80 82 85 88 90-92
94 96-98 105-107 109 113 117 121-122
124-125 127-128 130 134 136 141 143 146-148
152 161-163 166 175 177-178 181 194 199 201
204 210 212 216 218 223-226 228 230-231 234
236 241 243 246 253-257 261 270 272-273
276-278 281-283 287 292 295 298 303-304 308
315 323 329 335 352 359 362 401 416-417














trachea 


Clontech 


TRC001 


88 138 180 226 228 279 359 411 436




uterus 


Clontech 


UTR001 


3 10-11 23 77 92 106-107 109 111 141 197-198
218 241 257 270 274-275 302 315 329 396 400
413

















*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain
mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA
(Invitrogen), 4) Normal adult liver mRNA (Invitrogen), 5) Normal fetal kidney mRNA
(Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7) Normal fetal skin mRNA (Invitrogen), 8)
Human adrenal gland mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10) Human
leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) Human
lymph node mRNA (Clontech), 13) Human spinal cord mRNA (Clontech), 14) Human thyroid
mRNA (Clontech), 15) Human esophagus mRNA (BioChain), 16) Human conceptional umbilical
cord mRNA (BioChain).
SEQ ID NO:


Accession No.


Species


Description


Score


% Identity


J 




- — : 

Homo sapiens 


r : — "j r 3 

membrane-associated nucleic acid 

binding protein mRNA, partial cds. 






1 


gi7020305 


Homo sapiens 


cDNA FLJ20301 fis, clone HEP06569.


1728 


47 


1 


gi7294120 


Drosophila 
melanogaster 


CG16807 gene product 


1535 


53 


2 


AAY57911 


Homo sapiens 


Human transmembrane protein 
HTMPN-35. 


1258 


82 


2 


AAB88406 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0162. 


265 


39 


2 


gi14272664


Homo sapiens 


unnamed protein product 


265 


39 


3 


gi12654575


Homo sapiens 


Similar to gp25L2 protein, clone 
MGC:2142 IMAGE:2967520, mRNA, 
complete cds. 


1116 


100 


3 


gi12845568


Mus musculus 


putative 


1099 


98 


3 


gi996057 


Homo sapiens 


H.sapiens mRNA for gp25L2 protein. 


1096 


98 


4 


gi9971050 


Homo sapiens 


Human DNA sequence from clone
RP11-526K24 on chromosome 20.
Contains a novel gene, the 5' end of a
novel gene, two CpG islands, ESTs,
GSSs and STSs, complete sequence.


4348 


99 


4 


AAB95086 


Homo sapiens 


Human protein sequence SEQ ID
NO:16999.


3034 


99 


4 


gi10433753


Homo sapiens 


cDNA FLJ12307 fis, clone
MAMMA1001908.


3034 


99 


5 


gi4689106 


Homo sapiens 


NADH-ubiquinone oxidoreductase B8 
subunit 


505 


100 


5 


gi2909862 


Homo sapiens 


NADH-ubiquinone oxidoreductase 
subunit CI-B8 mRNA, complete cds. 


505 


100 


5 


gi12539408


Homo sapiens 


NDUFA2 gene for NADH 
dehydrogenase (ubiquinone) 1 alpha 
subcomplex 2, complete cds. 


505 


100 


6 


AAG64416 


Homo sapiens 


Human nucleoprotein. 


3765 


100 


6 


gi10443046


Homo sapiens 


Human DNA sequence from clone
RP11-465L10 on chromosome 20.
Contains 10 CpG islands, ESTs, STSs
and GSSs. Contains the gene for a
novel protein similar to Drosophila
CG11399, the gene for a novel C2H2
type zinc finger protein similar to
chicken FZF-1, a Ferritin light
polypeptide (FTL) pseudogene, the
MMP9 gene for matrix
metalloproteinase 9 (gelatinase B,
92kD gelatinase, 92kD type IV
collagenase) (CLG4B), a novel gene,
the SLC12A5 gene for solute carrier
family 12, (potassium-chloride
transporter) member 5 (KIAA1176)
and the 3' end of gene KIAA1637,
complete sequence.


3765 


100 


6 


gi15426514


Homo sapiens 


clone MGC:16205 IMAGE:3640928,
mRNA, complete cds. 


3765 


100 


7 


AAG64416 


Homo sapiens 


Human nucleoprotein. 


3366 


100 
7


gi10443046


Homo sapiens 


Human DNA sequence from clone
RP11-465L10 on chromosome 20.
Contains 10 CpG islands, ESTs, STSs
and GSSs. Contains the gene for a
novel protein similar to Drosophila
CG11399, the gene for a novel C2H2
type zinc finger protein similar to
chicken FZF-1, a Ferritin light
polypeptide (FTL) pseudogene, the
MMP9 gene for matrix
metalloproteinase 9 (gelatinase B,
92kD gelatinase, 92kD type IV
collagenase) (CLG4B), a novel gene,
the SLC12A5 gene for solute carrier
family 12, (potassium-chloride
transporter) member 5 (KIAA1176)
and the 3' end of gene KIAA1637,
complete sequence.


3366 


100 


7 


gi15426514


Homo sapiens 


clone MGC:16205 IMAGE:3640928, 
mRNA, complete cds. 


3366 


100 


8 


gi14571904


Rattus 
norvegicus 


lysosomal amino acid transporter 1 


2145 


85 


8 


AAE04910 


Homo sapiens 


Human transporter and ion channel-23 
(TRICH-23) protein. 


1239 


56 


8 


gi7297404 


Drosophila 
melanogaster 


CG13384 gene product


837 


43 


9 


AAB73686 


Homo sapiens 


Human oxidoreductase protein ORP- 
19. 


1301 


98 


9 


gi7291405 


Drosophila 
melanogaster 


T3dh gene product 


808


59 


9 


gi5824752 


Caenorhabditis 
elegans 


predicted using Genefinder~contains
similarity to Pfam domain: PF00465
(Iron-containing alcohol
dehydrogenases), Score=177.7, E-
value=1.9e-50, N=2~cDNA EST
EMBL:Z14517 comes from this gene;
cDNA EST yk18d4.3 comes from this
gene~cDNA EST yk18d4.5 comes
from this gene; cDNA EST yk116f5.5
comes from this gene~cDNA EST
yk132h3.3 comes from this gene;
cDNA EST yk73d10.3 comes from this
gene~cDNA EST yk93e9.3 comes from
this gene; cDNA EST yk132h3.5
comes from this gene~cDNA EST
yk73d10.5 comes from this gene;
cDNA EST yk93e9.5 comes from this
gene~cDNA EST yk135b6.5 comes
from this gene; cDNA EST yk135b6.3
comes from this gene~cDNA EST
yk201e5.3 comes from this gene;
cDNA EST yk268b1.3 comes from this
gene~cDNA EST yk261d6.3 comes


685 


52 
from this gene; cDNA EST yk262h11.3
comes from this gene~cDNA EST
yk292h11.3 comes from this gene;
cDNA EST yk304d8.3 comes from this
gene~cDNA EST yk344b7.3 comes
from this gene; cDNA EST yk351a6.3
comes from this gene~cDNA EST
yk366d9.3 comes from this gene;
cDNA EST yk368e3.3 comes from this
gene~cDNA EST yk372c11.3 comes
from this gene; cDNA EST yk389g3.3
comes from this gene~cDNA EST
yk422d2.3 comes from this gene;
cDNA EST yk381d7.3 comes from this
gene~cDNA EST yk201e5.5 comes
from this gene; cDNA EST yk267f6.5
comes from this gene~cDNA EST
yk268b1.5 comes from this gene;
cDNA EST yk261d6.5 comes from this
gene~cDNA EST yk262h11.5 comes
from this gene; cDNA EST yk292h11.5
comes from this gene~cDNA EST
yk304d8.5 comes from this gene;
cDNA EST yk344b7.5 comes from this
gene~cDNA EST yk368e3.5 comes
from this gene; cDNA EST yk372c11.5
comes from this gene~cDNA EST
yk351a6.5 comes from this gene;
cDNA EST yk366d9.5 comes from this
gene~cDNA EST yk389g3.5 comes
from this gene; cDNA EST yk422d2.5
comes from this gene~cDNA EST
yk560f4.3 comes from this gene;
cDNA EST yk625h5.3 comes from this
gene~cDNA EST yk381d7.5 comes
from this gene; cDNA EST yk560f4.5
comes from this gene~cDNA EST
yk625h5.5 comes from this gene






10 


AAB73686 


Homo sapiens 


Human oxidoreductase protein ORP- 
19. 


1552 


99 


10 


gi7291405 


Drosophila 
melanogaster 


T3dh gene product 


891 


56 


10 


gi5824752 


Caenorhabditis 
elegans 


predicted using Genefinder~contains
similarity to Pfam domain: PF00465
(Iron-containing alcohol
dehydrogenases), Score=177.7, E-
value=1.9e-50, N=2~cDNA EST
EMBL:Z14517 comes from this gene;
cDNA EST yk18d4.3 comes from this
gene~cDNA EST yk18d4.5 comes
from this gene; cDNA EST yk116f5.5
comes from this gene~cDNA EST
yk132h3.3 comes from this gene;
cDNA EST yk73d10.3 comes from this


730 


51 
gene~cDNA EST yk93e9.3 comes from
this gene; cDNA EST yk132h3.5
comes from this gene~cDNA EST
yk73d10.5 comes from this gene;
cDNA EST yk93e9.5 comes from this
gene~cDNA EST yk135b6.5 comes
from this gene; cDNA EST yk135b6.3
comes from this gene~cDNA EST
yk201e5.3 comes from this gene;
cDNA EST yk268b1.3 comes from this
gene~cDNA EST yk261d6.3 comes
from this gene; cDNA EST yk262h11.3
comes from this gene~cDNA EST
yk292h11.3 comes from this gene;
cDNA EST yk304d8.3 comes from this
gene~cDNA EST yk344b7.3 comes
from this gene; cDNA EST yk351a6.3
comes from this gene~cDNA EST
yk366d9.3 comes from this gene;
cDNA EST yk368e3.3 comes from this
gene~cDNA EST yk372c11.3 comes
from this gene; cDNA EST yk389g3.3
comes from this gene~cDNA EST
yk422d2.3 comes from this gene;
cDNA EST yk381d7.3 comes from this
gene~cDNA EST yk201e5.5 comes
from this gene; cDNA EST yk267f6.5
comes from this gene~cDNA EST
yk268b1.5 comes from this gene;
cDNA EST yk261d6.5 comes from this
gene~cDNA EST yk262h11.5 comes
from this gene; cDNA EST yk292h11.5
comes from this gene~cDNA EST
yk304d8.5 comes from this gene;
cDNA EST yk344b7.5 comes from this
gene~cDNA EST yk368e3.5 comes
from this gene; cDNA EST yk372c11.5
comes from this gene~cDNA EST
yk351a6.5 comes from this gene;
cDNA EST yk366d9.5 comes from this
gene~cDNA EST yk389g3.5 comes
from this gene; cDNA EST yk422d2.5
comes from this gene~cDNA EST
yk560f4.3 comes from this gene;
cDNA EST yk625h5.3 comes from this
gene~cDNA EST yk381d7.5 comes
from this gene; cDNA EST yk560f4.5
comes from this gene~cDNA EST
yk625h5.5 comes from this gene






11 


AAB85166 


Homo sapiens 


Human Bcl-Gl polypeptide. 


1598 


87 


11 


gi14598300


Homo sapiens 


unnamed protein product 


1598 


87 


11 


gi12584085


Homo sapiens 


apoptosis regulator BCL-G long form 
(BCLG) mRNA, complete cds. 


1598 


87 


12 


gi15077865


Mus musculus 


bullous pemphigoid antigen 1-b


1253


82 


12 


gi15077863


Mus musculus 


bullous pemphigoid antigen 1-a 


1253 


82 


12 


gi6624582 


Homo sapiens 


Human DNA sequence from clone
RP1-61B2 on chromosome 6p11.2-12.3.
Contains isoforms 1 and 3 of BPAG1
(bullous pemphigoid antigen 1
(230/240kD)), an exon of a gene similar
to murine MACF cytoskeletal protein,
STSs and GSSs, complete sequence.


733 


99 


13 


gi3702270 


Homo sapiens 


chromosome 19, cosmid R31408,
complete sequence. 


887 


93 


13 


gi401845 


Homo sapiens 


ribosomal protein L18a mRNA,
complete cds. 


887 


93 


13 


gi13960144


Homo sapiens 


ribosomal protein L18a, clone
MGC:4476 IMAGE:2961519, mRNA,
complete cds. 


887 


93 


14 


AAB59090 


Homo sapiens 


Breast and ovarian cancer associated 
antigen protein sequence SEQ ID 798. 


496 


80 


14 


AAB44129


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1574. 


453 


81


14 


gi14198321


Mus musculus 


ribosomal protein L31


453 


81 


15 


gi5689465 


Homo sapiens 


mRNA for KIAA1064 protein, partial 
cds. 


5643 


100


15 


gi4884368 


Homo sapiens 


mRNA; cDNA DKFZp586L1220
(from clone DKFZp586L1220); partial
cds.


Xozo 


100


15 


gi13161145


Homo sapiens 


zinc finger protein mRNA, complete
cds. 


369 


36 


16 


gi5870832 


Mus musculus 


skm-BOP1


1AGA 


OA 


16 


gi5870834 


Mus musculus 


skm-BOP2 


2397 


91 


16 


gi1809322


Mus musculus 


t-BOP 


22oD 




17 


gi13938126


Mus musculus 


RIKEN cDNA 3732409C05 gene 


2678 


98 


17 


gi12852375


Mus musculus 


putative 


2678 


98 


17 


gi7024433 


Torpedo
marmorata


male sterility protein 2-like protein 


2307 


80 


18 


AAB95482 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18007. 


1572 


67 


18 


gi14042809


Homo sapiens 


cDNA FLJ14932 fis, clone
PLACE1009639.


1572 


67 


18 


gi12053165


Homo sapiens 


mRNA; cDNA DKFZp434K0427 
(from clone DKFZp434K0427);
complete cds. 


1572 


67 


19 


gi7243159 


Homo sapiens 


mRNA for KIAA1389 protein, partial 
cds. 


7842 


99 


19 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
alpha mRNA, complete cds. 


3777 


53 


19 


gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


3768 


53 


20 


gi7243159 


Homo sapiens 


mRNA for KIAA1389 protein, partial 
cds. 


7714 


98 


20 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6
oncoproteins targeted protein E6TP1
alpha mRNA, complete cds.


3806


54






20 


gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


3797 


jj 


21 


AAB95328 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17595. 


753 


61 


21 


AAB93757 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13432. 


753 


61 


21 


AAB29657 


Homo sapiens 


Human membrane-associated protein 
HUMAP-14. 


753 


61 


22 


gi7673373 


Homo sapiens 


SCAN-related protein RAZ1 (RAZ1) 
mRNA, partial cds. 


1104 


100 


22 


AAG93274 


Homo sapiens 


Human protein HP10543. 


900 


100 


22 


AAB42846 


Homo sapiens 


Human ORFX ORF2610 polypeptide 
sequence SEQ ID NO:5220. 


900 


100 


23 


gi7242963 


Homo sapiens 


mRNA for KIAA1304 protein, partial 
cds. 


5409 


99 


23 


gi3413874 


Homo sapiens 


mRNA for KIAA0456 protein, partial 
cds. 


3695 


67 


23 


AAB30852 


Homo sapiens 


Amino acid sequence of human signal 
transduction protein SGT6-1. 


3685 


68 


24 


AAG64386 


Homo sapiens 


Human alcohol dehydrogenase 39. 


1228 


77 


24 


gi12861800


Mus musculus


putative 


1083 


66 


24 


gi3878713 


Caenorhabditis 
elegans 


weak similarity with quinone 
oxidoreductase, contains similarity to 
Pfam domain: PF00107 (Zinc-binding 
dehydrogenases), Score— 80.6, E- 
value=6.2e-06, N=1~cDNA EST
yk164b4.5 comes from this
gene~cDNA EST yk164b4.3 comes
from this gene~cDNA EST yk264f3.5
comes from this gene 


556 


39 


25 


AAE02629 


Homo sapiens 


Human secreted protein Zalpha37. 


2481 


100 


25 


gi14536691


Homo sapiens 


unnamed protein product 


2481 


100 


25 


AAY99419 


Homo sapiens 


Human PRO1780 (UNQ842) amino 
acid sequence SEQ ID NO:282.


1960


77 


26 


gi6102869 


Homo sapiens 


mRNA; cDNA DKFZp434H1235 
(from clone DKFZp434H1235); partial 
cds. 


831 


100 


26 


gi12853439


Mus musculus 


putative 




OA 


26 


gi2198807


Gallus gallus 


monocarboxylate transporter 3 


505 


29 


27 


gi7299069 


Drosophila 
melanogaster 


CG11755 gene product


205 


34 


27 


gi3875367 


Caenorhabditis 
elegans 


contains 3 cysteine rich repeats 


136 


41 


27 


gi3249080 


Arabidopsis 
thaliana 


Contains similarity to MYB 
transcription factor isolog T01O24.1 
gb|2288980 from A. thaliana BAC 
gb|AC002335.


69 


35 


28 


gi11041628


Homo sapiens 


RPL6 gene for ribosomal protein L6, 
complete cds. 


1207 


98 


28 


gi433416 


Homo sapiens 


Human mRNA for DNA-binding 
protein, TAXREB107, complete cds. 


1207 


98 


28 


gi13278717


Homo sapiens 


ribosomal protein L6, clone 
MGC:1635 IMAGE:2823733, mRNA, 
complete cds. 


1207 


98 


29 


AAG03810 


Homo sapiens 


Human secreted protein, SEQ ID NO:
7891.


845


100 


29 


gi186800


Homo sapiens 


Human ribosomal protein L12 mRNA,
complete cds. 


845 


100 


29 


gi14198333


Homo sapiens 


ribosomal protein L12, clone 
MGC:9760 IMAGE:3855674, mRNA, 
complete cds. 


845 


100 


30 


AAB95051 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 16849. 


2965 


100 


30 


gi10433519


Homo sapiens 


cDNA FLJ12118 fis, clone
MAMMA1000085, weakly similar to 
PUTATIVE CYSTEINYL-TRNA 
SYNTHETASE C29E6.06C (EC 
6.1.1.16). 


2965 


100 


30 


gi13938199


Homo sapiens 


hypothetical protein FLJ12118, clone
MGC:15044 IMAGE:2822557, mRNA,
complete cds. 


2959 


99 


31 


gi12858123


Mus musculus 


putative 


2441 


73 


31 


gi7959195 


Homo sapiens 


mRNA for KIAA1467 protein, partial 
cds. 


2232 


100 


31 


gi13278148


Mus musculus 


Similar to RIKEN cDNA 8430419L09
gene 


794 


83 


32 


gi15530305


Homo sapiens 


Similar to RIKEN cDNA 1700045I19
gene, clone MGC:2647 
IMAGE:3509621, mRNA, complete 
cds. 


1245 


84 


32 


gi9858803 


Mus musculus 


Zrp228 


512 


An 


32 


AAG75629 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:6393.


511 


46 


33 


gi8101071 


Homo sapiens 


golgin-like protein (GLP) gene, 
complete cds. 


312 


46 


33 


gi8099669 


Homo sapiens 


golgin-like protein (GLP) mRNA,
complete cds. 


312 


46


33 


gi11037008


Human 
herpesvirus 8 


latent nuclear antigen 


245 


40 


34 


gi437985 


Canis
familiaris


Rabl2 protein 


1071 


99 


34 


gi206531 


Rattus 
norvegicus 


RAB12 


995 


96 


34 


gi12851149


Mus musculus 


putative 


819 


96 


35 


gi13543689


Homo sapiens 


Similar to RIKEN cDNA 4933405K01 
gene, clone MGC:14799
IMAGE:4068454, mRNA, complete 
cds. 


1077


96 


35 


gi12805373


Mus musculus 


Unknown (protein for MGC:7298)


950 


84 


35 


gi12855529


Mus musculus 


putative 


642 


79 


36 


gi12697979


Homo sapiens 


mRNA for KIAA1717 protein, partial 
cds. 


1982 


100 


36 


gi1651678


Synechocystis 
sp. PCC 6803 


ORF_ID:slr1485~hypothetical protein


185 


34 
36


gi2739367 


Arabidopsis
thaliana


putative phosphatidylinositol-4-
phosphate 5-kinase 


153 


28 


37 


gi3800892 


Homo sapiens 


neurexin III-alpha gene, partial cds.


1255 


99 


37 


gi294602 


Rattus 
norvegicus 


neurexin III-alpha


1160 


91 


37 


gi205716 


Rattus 
norvegicus 


neurexin II-alpha-a


561 


50 


38 


gi10047315


Homo sapiens 


mRNA for KIAA1619 protein, partial 
cds. 


4447 


99 


38 


gi8217424 


Homo sapiens 


Human DNA sequence from clone
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ
(DHR, GLGF) domain protein, the
gene for a novel protein similar to
KIAA0552, KIAA0341 and Fugu
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46G11.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven
putative CpG islands, complete
sequence.


4407 


99 


38 


gi4836757 


Mus musculus 


semaphorin subclass 4 member G 


4021 


90 




gllU43ooo4 


Homo sapiens 


cDNA: FLJ22324 fis, clone 
HRC05551. 


307 


100 


39 


gi13559240


Homo sapiens 


Human DNA sequence from clone
RP5-842G6 on chromosome 20.
Contains the 3' end of a novel gene, the
3' end of the gene for a novel protein
similar to SEL1L (sel-1 (suppressor of
lin-12, C.elegans)-like), ESTs, STSs
and GSSs, complete sequence.


307 


100 


39 


gi13543669


Homo sapiens 


hypothetical protein FLJ22324, clone
MGC:14701 IMAGE:4247211, mRNA, 
complete cds. 


307 


100 


40 


gi14595019


Homo sapiens 


mRNA for keratin 6 irs (KRT6IRS 
gene). 


2615 


99 


40 


gi6092075 


Mus musculus 


type II cytokeratin


2414 


91 


40 


gi15559584


Homo sapiens 


Similar to keratin 6A, clone 
MGC:20671 IMAGE:3639270, mRNA, 
complete cds. 


1468 


57 


41 


gi12655452


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene).


1157 


86 


41 


gi12655464


Homo sapiens 


partial mRNA for keratin associated 
protein 4.15 (KRTAP4.15 gene). 


1090 


88 


41 


gi12655462


Homo sapiens 


mRNA for keratin associated protein 
4.14 (KRTAP4.14 gene). 


1063 


84 
42


gi553772 


Homo sapiens 


Human Tcr-C-delta gene, exons 1-4; 
Tcr-V-delta gene, exons 1-2; T-cell
receptor alpha (Tcr-alpha) gene, J1-J61 
segments; and Tcr-C-alpha gene, exons 
1-4. 


110 


100 


42 


gi4379087 


Homo sapiens 


mRNA for TCR alpha variable region, 
patient AF31. 


73 


46 


42 


AAW40057 


Homo sapiens 


Cellular transcriptional factor p300. 


71 


42 


43 


gi15866589


Capsella 
rubella 


hypothetical protein 


97 


30 


43 


gi3879045 


Caenorhabditis 
elegans 


R102.6 


96 


34 


43 


AAY56133 


Homo sapiens 


Human N-methyl-D-aspartate receptor 
2 subunit SEQ ID NO:54. 


94 


52 


44 


gi13569345


Homo sapiens 


pregnancy-associated plasma 
preproprotein-A2 mRNA, complete 
cds. 


9839 


99 


44 


gi10639043


Homo sapiens 


mRNA for pregnancy-associated 
plasma protein-E (PAPPE gene).


8966 


99 


44 


gi1142970


Homo sapiens 


Human pregnancy-associated plasma 
protein-A preproform (PAPPA) 
mRNA, complete cds. 


3856 


45 


45 


gi12851017


Mus musculus


putative 


578 


83 


45 


gi4490653 


Schizosaccharomyces
pombe


profilin. 


186 


35 


45 


gi440266 


Acanthamoeba 
castellanii 


profilin I


166 


34 


46 


gi1617480


Comamonas 
testosteroni 


unknown 


712 


82 


46 


gi3046394 


Ralstonia 
eutropha 


phbF 


563 


66 


46 


gi6683782 


Burkholderia 
sp. DSMZ 
9242 


unknown 


560 


61 


47 


gi9229934 


Mus musculus


midnolin 


2103 


78 


47 


AAB56832 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1410.


912 


71 


47 


gi15929300


Homo sapiens 


Similar to midnolin, clone 
IMAGE:3958934, mRNA, partial cds. 


907 


100 


48 


gi13377624


Homo sapiens 


calicin mRNA, complete cds. 


3089 


99 


48 


gi854100 


Homo sapiens 


H.sapiens mRNA for calicin (partial). 


3076 


99 


48 


gi853784 


Bos taurus 


calicin 


2896 


91 


49 


AAB68411 


Homo sapiens 


Amino acid sequence of a human 
NOV2 polypeptide. 


2131 


100 


49 


AAY99407 


Homo sapiens 


Human PRO1337 (UNQ692) amino
acid sequence SEQ ID NO:236. 


2101 


99 


49 


AAB68414 


Homo sapiens 


Amino acid sequence of NOV2 
polypeptide clone TA-cgAL132708 A. 


2014 


99 


50 


gi12082748


Mus musculus 


T-box transcription factor TBX18 


2972 


93 


50 


gi5102617 


Homo sapiens 


Human DNA sequence from clone
33L1 on chromosome 6q14.1-15.
Contains the gene for novel T-box
(Brachyury) family protein. Contains
ESTs, STSs, GSSs and two putative
CpG islands, complete sequence.


2634


100


50 


gi12849661


Mus musculus


putative 


2223 


96 


51 


gi12843048


Mus musculus 


putative 


339 


72 


51 


gi6691626 


Homo sapiens 


RAGE mRNA for advanced glycation 
endproducts receptor, complete cds. 


111 


32 


51 


gi190846


Homo sapiens 


Human receptor for advanced 
glycosylation end products (RAGE) 
mRNA, partial cds. 


111 


32 


52 


AAG71840 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO:1521.


1313 


85 


52 


AAG71839 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO:1520.


1226 


81 


52 


AAG71837 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO:1518.


1159 


77 


53 


AAB94026 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14163. 


966 


98 


53 


gi10433955


Homo sapiens 


cDNA FLJ12457 fis, clone
NT2RM1000666, weakly similar to 
DNA-BINDING PROTEIN A. 


966 


98 


53 


gi7295442 


Drosophila 
melanogaster 


CG17334 gene product 


302 


47 


54 


gi8980396 


Homo sapiens 


mRNA for T-cell antigen receptor- 
alpha, clone Pil-la, partial. 


566 


97 


54 


gi2358063 


Homo sapiens 


T-cell receptor alpha delta locus from 
bases 752679 to 1000555 (section 4 of 
5) of the Complete Nucleotide 
Sequence. 


565 


100 


54 


gi623149 


Macaca 
mulatta 


T-cell receptor alpha 


512 


85 


55 


gi2792496 


Rattus 
norvegicus 


tulip 2 


2437 


86 


55 


gi4884288 


Homo sapiens 


mRNA; cDNA DKFZp566D133 (from 
clone DKFZp566D133); partial cds. 


1983 


99


55 


AAB41763 


Homo sapiens 


Human ORFX ORF1527 polypeptide 
sequence SEQ ID NO:3054. 


1976 


98 


56 


gi15524592


Homo sapiens 


unnamed protein product 


1033 


52 


56 


gi537514 


Homo sapiens 


Human arylacetamide deacetylase 
mRNA, complete cds. 


1033 


52 


56 


AAB54079 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:531.


1017 


51 


57 


AAB33831 


Homo sapiens 


Human secreted protein BLAST search 
protein SEQ ID NO: 175. 


149 


35 


57 


gi1109682


Bos taurus


G-protein gamma-12 subunit


149 


35 


57 


AAW09416 


Homo sapiens 


Human G protein gamma-7 subunit. 


144 


33 


58 


gi12082750


Mus musculus 


T-box transcription factor TBX20 


1469 


93 


58 


gi9909810 


Mus musculus 


T-box transcription factor 


1469 


93 


58 


gi7229717 


Danio rerio 


H15-related T-box transcription factor 
hrT 


1346 


85 


59 


gi4185946 


Human 
endogenous 
retrovirus K 


gag protein 


146 


26 


59 


gi5802821


Homo sapiens


endogenous retrovirus HERV-K108,
complete sequence.


146


26


59 


gi5802814 


Homo sapiens 


endogenous retrovirus HERV-K103, 
complete sequence. 


146 


26 


60 


AAB94756 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15815. 


126 


42 


60 


gi332612 


Gibbon ape 
leukemia virus 


pol polyprotein 


113 


50 


60 


gi3133302


Sus scrofa 


pol protein 


110 


53 


61 


gi10121625


Gillichthys 
mirabilis 


60S acidic ribosomal protein P1


127 


81 


61 


AAB44012 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1457. 


125 


78 


61 


AAB43434


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:879. 


125 


78 


62 


AAB12585


Homo sapiens 


Human T cell activating protein SEQ 
ID NO:4.


140 


37 


62 


gi12805221


Mus musculus


lymphocyte antigen 6 complex 


140 


37 


62 


gi198924


Mus musculus


Ly-6A.2 


140 


37


63


gi6969165


Homo sapiens


Human DNA sequence from clone 
RP3-475N16 on chromosome 6p12.3-
21.2. Contains the genes for CTG4A, 
pre-T cell receptor alpha, a novel 
protein similar to RPL7A (60S 
ribosomal protein L7A) and the 3' end
of gene KIAA0240. Contains ESTs, 
STSs, GSSs and four putative CpG 
islands, complete sequence. 


573 


67 


63 


gi12841727


Mus musculus 


putative 


512 


59 


63 


gi15293877


Ictalurus 
punctatus 


ribosomal protein L7 


314 


38 


64 


gi181573


Homo sapiens 


Human cytokeratin 8 (CK8) gene, 
complete cds. 


1147 


79 


64 


gi181400


Homo sapiens 


Human cytokeratin 8 mRNA, complete 
cds. 


1147 


78 


64 


gi400416 


Homo sapiens 


H.sapiens KRT8 mRNA for keratin 8. 


1147 


79 


65 


gi13620887


Mus musculus 


mitochondrial ribosomal protein S6 


633 


100 


65 


gi13620885


Homo sapiens 


MRPS6 mRNA for mitochondrial 
ribosomal protein S6, partial cds. 


565 


85 


65 


gi14603226


Homo sapiens 


clone MGC:19576 IMAGE:4304420,
mRNA, complete cds. 


565 


85 


66 


gi13537119


Homo sapiens 


mRNA for PAR-6 gamma, complete 
cds. 


1956 


100 


66 


gi8037909 


Mus musculus 


PAR6A 


1490 


76 


66 


gi9453884 


Homo sapiens 


mRNA for 16-5-5, partial cds. 


1304 


93 


67 


AAB95293 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17517. 


776 


79 


67 


AAG81270 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:58. 


776 


79 


67 


gi14035848


Homo sapiens 


unnamed protein product 


776 


79 


68 


gi7020759 


Homo sapiens 


cDNA FLJ20565 fis, clone REC00542. 


930 


60 


68 


gi15216181


Homo sapiens 


mRNA for putative 67-11-3 protein.


927


60 


68 


gi15930069


Homo sapiens 


Similar to hypothetical protein
FLJ20565, clone MGC:8850
IMAGE:3914396, mRNA, complete
cds.


917


60


69 


gi3228237 


Homo sapiens 


UHS KerB gene. 


810 


72 


69 


gi200962 


Mus musculus 


serine 1 ultra high sulfur protein 


755 


69 


69 


gi32472 


Homo sapiens 


H.sapiens mRNA for high-sulphur 
keratin. 


749 


71 


70 


AAB92789 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 11284. 


3518 


100 


70 


gi7022420


Homo sapiens 


cDNA FLJ10407 fis, clone
NT2RM4000520. 


3518 


100 


70 


gi13111786


Homo sapiens


hypothetical protein FLJ10407, clone
MGC:970 IMAGE:3509727, mRNA,
complete cds. 


3511 


99 


71 


gi13325178


Homo sapiens


Similar to RIKEN cDNA 2210016F16
gene, clone MGC:10999
IMAGE:3638524, mRNA, complete 
cds. 


856 


100 


71 


gi7291278 


Drosophila 
melanogaster 


CG9752 gene product 


744 


43 


71 


gi2854153 


Caenorhabditis 
elegans 


Hypothetical protein C11D2.4 


729 


45 


72 


gi7020991 


Homo sapiens 


cDNA FLJ20718 fis, clone HEP17872.


3013 


100 


72 


gi15680144


Homo sapiens 


hypothetical protein FLJ20718, clone 
IMAGE:4577269, mRNA, partial cds. 


2906 


99 


72 


gi10801646


Macaca 
fascicularis 


hypothetical protein 


1097 


99 


73 


AAG93290 


Homo sapiens 


Human protein HP10650.


1215 


100 


73 


gi14587195


Homo sapiens 


FAPP1-associated protein 1 (FASP1)
mRNA, complete cds. 


1215 


100 


73 


gi8118225


Homo sapiens 


chromosome 21 unknown mRNA, 


1215 


100 


74 


gi10436998


Homo sapiens 


cDNA: FLJ21011 fis, clone 
CAE04289. 


2522 


100 


74 


gi15030282


Homo sapiens 


clone MGC:16827 IMAGE:3855873, 
mRNA, complete cds. 


2522 


100 


74 


gi8570641 


Homo sapiens 


clone 133K02 unknown mRNA. 


2514 


99 


75 


gi6599255


Homo sapiens 


mRNA; cDNA DKFZp434C0328 
(from clone DKFZp434C0328). 


1612 


100 


75 


gi6330416 


Homo sapiens 


mRNA for KIAA1201 protein, partial 
cds. 


554 


38 


75 


AAB74726 


Homo sapiens 


Human membrane associated protein 
MEMAP-32. 


496 


35 


76 


gi7021059 


Homo sapiens 


cDNA FLJ20758 fis, clone HEP01508.


1450 


100 


76 


AAW88552 


Homo sapiens 


Secreted protein encoded by gene 19 
clone HSAVU34. 


1429 


100 


76 


gi15341707


Homo sapiens 


clone MGC:19979 IMAGE:3939273, 
mRNA, complete cds. 


1429 


100 


77 


AAB95410 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17796.


774 


100 


77 


gi10435394


Homo sapiens 


cDNA FLJ13391 fis, clone
PLACE1001241. 


774 


100 


77 


gi10503974


Homo sapiens 


clone SP24 unknown mRNA. 


765 


99 


78 


gi7020587 


Homo sapiens 


cDNA FLJ20467 fis, clone KAT06638.


737 


100


78


AAB42883 


Homo sapiens 


Human ORFX ORF2647 polypeptide 
sequence SEQ ID NO:5294. 


530 


100 


78 


AAB56642 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1220. 


530 


100 


79 


AAW93948 


Homo sapiens 


Human regulatory molecule HRM-4 
protein. 


441 


91 


79 


gi12852696


Mus musculus 


putative 


386 


47 


79 


gi12751103


Homo sapiens 


PNAS-129 mRNA, complete cds. 


348 


100 


80 


gi7243053 


Homo sapiens 


mRNA for KIAA1336 protein, partial
cds. 


3851 


99 


80 


gi7292144 


Drosophila 
melanogaster 


CG2069 gene product 


1634 


44 


80 


gi1065457


Caenorhabditis 
elegans 


C54G7.4 gene product 


706 


25 


81 


gi10439581


Homo sapiens 


cDNA: FLJ23023 fis, clone 
LNG01678. 


652 


100 


81 


gi7021132 


Homo sapiens 


cDNA FLJ20813 fis, clone
ADSE01247. 


652 


100 


81 


AAG74674 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5438. 


556 


92 


82


gi5262611


Homo sapiens


mRNA; cDNA DKFZp434I114 (from
clone DKFZp434I114); complete cds.


838 


100


82


gi11493368


Homo sapiens


Human DNA sequence from clone
RP5-1009E24 on chromosome 20 
Contains the SN gene encoding 
sialoadhesin, a novel gene similar to 
KIAA0417, the CENPB gene for 
centromere protein B, the CDC25B 
gene for Cell division cycle protein 
25B, three novel genes, the 5' end of
gene KIAA1271, nine CpG islands, 
ESTs, STSs and GSSs, complete 
sequence. 


838 


100 


82 


gi13543798


Mus musculus 


RIKEN cDNA 4931426K16 gene


680 


92 


83 


AAB57003 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1581.


1302 


99 


83 


AAR60558 


Homo sapiens 


Human basigin I. 


1302 


99 


83 


gi3492872 


Homo sapiens 


chromosome 19, cosmid F18382 
(LLNLF-140D2) and 3' overlapping 
restriction fragment, complete 
sequence. 


1302 


99 


84 


gi9187614 


Homo sapiens 


mRNA full length insert cDNA clone 
EUROIMAGE 1759349.


580 


100 


84 


AAB01394 


Homo sapiens 


Neuron-associated protein. 


70 


39 


84 


AAB54358 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:810. 


70 


39 


85 


gi15986445


Homo sapiens 


p90 autoantigen mRNA, complete cds. 


4513 


99 


85 


gi7959315 


Homo sapiens 


mRNA for KIAA1524 protein, partial 
cds. 


4357 


99 


85 


AAB95207 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17311. 


2341 


100 


86 


gi7959231 


Homo sapiens 


mRNA for KIAA1485 protein, partial 
cds. 


5813 


99


86


AAB40418 


Homo sapiens 


Human ORFX ORF182 polypeptide 
sequence SEQ ID NO:364. 


708 


99 


86 


gi5901529 


Homo sapiens 


C2H2 type Kruppel-like zinc finger
protein splice variant b (ZNF236) 
mRNA, complete cds. 


520 


24 


87 


gi7243270 


Homo sapiens 


mRNA for KIAA1436 protein, partial 
cds. 


4604 


99 


87 


gi5051974 


Mus musculus 


F2 alpha prostoglandin regulatory 
protein 


4195 


89 


87 


gi1054884


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory 
protein precursor 


4191 


88 


88 


gi13241286


Mus musculus 


GABA(A) receptor-associated protein- 
like 2 


607 


100 


88 


gi2104570 


Rattus 
norvegicus 


GEF-2 


607 


100 


88 


gi4433387 


Bos taurus 


general protein transport factor p16


607 


100 


89 


gi15859535


Homo sapiens 


unnamed protein product 


5935 


99 


89 


gi3043606 


Homo sapiens 


mRNA for KIAA0541 protein, partial 
cds. 


5890 


100 


89 


gi15624075


Homo sapiens 


TGF-beta resistance-associated protein 
TRAG (TRAG) mRNA, partial cds. 


5719 


96 


90 


gi337370 


Homo sapiens 


Human rapamycin- and FK506-binding
protein, complete cds. 


740 


100 


90 


gi13097252


Homo sapiens 


Similar to FK506 binding protein 2 (13 
kDa), clone MGC:5177
IMAGE:3445148, mRNA, complete 
cds. 


740 


100 


90 


AAQ31004 aa 
1 


Homo sapiens 


hRFKBP cDNA. 


735 


99 


91 


gi12053147


Homo sapiens 


mRNA; cDNA DKFZp434F1726 (from 
clone DKFZp434F1726). 


1450 


100 


91 


gi412195 


Homo sapiens 


unknown 


265 


98 


91 


AAR04931 


Homo sapiens 


Interferon-gamma receptor segment 
from clone 39 responsible for binding
the target 


260 


96 


92 


gi10437948


Homo sapiens 


cDNA: FLJ21783 fis, clone HEP00284. 


3276 


100 


92 


AAB95352 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17643. 


1953 


99 


92 


gi10435077


Homo sapiens 


cDNA FLJ13171 fis, clone
NT2RP3003819. 


1953 


99 


93 


gi12803319


Homo sapiens 


clone MGC:3090 IMAGE:3347913,
mRNA, complete cds. 


4837 


99 


93 


gi14044064


Homo sapiens 


hypothetical protein DKFZp762M115,
clone MGC:14418 IMAGE:4302613, 
mRNA, complete cds. 


4831 


99 


93 


gi10047337


Homo sapiens 


mRNA for KIAA1630 protein, partial 
cds. 


4671 


100 


94 


AAB70535 


Homo sapiens 


Human PRO5 protein sequence SEQ
ID NO: 10. 


2979 


100 


94 


gi13185719


Homo sapiens 


unnamed protein product 


2979 


100 


94 


AAB94106 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14334. 


2334 


100 


95 


gi12837873


Mus musculus 


putative 


2370 


75


95


gi13195574


Mus musculus 


Praja1 isoform a


2339 


75 


95 


AAB93847 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13691. 


1941 


99 


96 


gi2224543 


Homo sapiens 


Human mRNA for KIAA0301 gene, 
partial cds. 


10626 


100


96


gi7529572


Homo sapiens 


Human DNA sequence from clone 
RP1-12208 on chromosome 6q14.2-
16.1. Contains the 3' part of a novel
gene partially coded for by KIAA0301,
a novel gene and the 3' part of the gene
KIAA0957. Contains ESTs, STSs, 
GSSs and a putative CpG island, 
complete sequence. 


10626 


100 


96 


gi10727627


Drosophila
melanogaster 


CG13185 gene product


1452 


34 


97


AAB82318


Homo sapiens


Human immunoglobulin receptor
IRTA5 protein.


2235 


100 


97


gi15528831


Homo sapiens


Fc receptor-like protein 1 (FCRH1)
mRNA, complete cds.


2235 


100 


97




Homo sapiens


Human DNA sequence from clone
RP11-367J7 on chromosome 1.
Contains (part of) two or more genes 
for novel Immunoglobulin domains 
containing proteins, a SON DNA 
binding protein (SON) pseudogene, a 
voltage-dependent anion channel 1 
(VDAC1) (plasmalemmal porin)
pseudogene, ESTs, STSs and GSSs, 
complete sequence. 


1533 


100


98


AAB82318


Homo sapiens


Human immunoglobulin receptor 
IRTA5 protein. 


2177 


98 


98 


gi15528831


Homo sapiens


Fc receptor-like protein 1 (FCRH1) 
mRNA, complete cds. 


2177 


98 


98 


gi9930921


Homo sapiens


Human DNA sequence from clone 
RP11-367J7 on chromosome 1.
Contains (part of) two or more genes 
for novel Immunoglobulin domains
containing proteins, a SON DNA 
binding protein (SON) pseudogene, a 
voltage-dependent anion channel 1 
(VDAC1) (plasmalemmal porin)
pseudogene, ESTs, STSs and GSSs, 
complete sequence. 


1533 


100 


99 


gi10438861


Homo sapiens 


cDNA: FLJ22461 fis, clone
HRC10107. 


4904 


100 


99 


gi15079400


Homo sapiens 


clone MGC:16796 IMAGE:3855477,
mRNA, complete cds. 


4899 


99 


99 


AAU03497 


Homo sapiens 


Human sterol sensing domain protein. 


4047 


99 


100 


gi6524024 


Mus musculus 


mammalian inositol hexakisphosphate 
kinase 1 


1031 


50 


100 


gi10280996


Rattus 
norvegicus 


inositol hexakisphosphate kinase 


1027 


49 


100 


gi6683115 


Homo sapiens 


mRNA for KIAA0263 protein, partial
cds.


1021


49


101 


gi6524024 


Mus musculus 


mammalian inositol hexakisphosphate
kinase 1 


1037 


51 


101 


gi10280996


Rattus 
norvegicus 


inositol hexakisphosphate kinase 


1033 


50 


101 


gi6683115 


Homo sapiens 


mRNA for KIAA0263 protein, partial 
cds. 


1027 


50 


102 


gi13623311


Homo sapiens 


clone IMAGE:3948563, mRNA, 
partial cds. 


1629 


100 


102 


gi3135968


Homo sapiens 


Human DNA sequence from clone 
XXbac-3418 on chromosome 6p21.3-
22.1. Contains the 5' end of the
ZNF184 gene for Kruppel-like zinc
finger protein 184, a heterogeneous
nuclear ribonucleoprotein A1
(HNRPA1) pseudogene, a CD83 
antigen pseudogene, ESTs, STSs, GSSs 
and three CpG islands, complete 
sequence. 


1627 


47 


102 


gi1769491


Homo sapiens 


Human kruppel-related zinc finger 
protein (ZNF184) mRNA, partial cds. 


1625 


47 


103 


gi16198398


Homo sapiens 


clone MGC:27353 IMAGE:4671816, 
mRNA, complete cds. 


2606 


85 


103 


gi829151 


Homo sapiens 


H.sapiens ZNF37A mRNA for zinc 
finger protein. 


1371 


99 


103 


gi9801232 


Homo sapiens 


Human DNA sequence from clone 
RP11-508N22 on chromosome 10
Contains part of a novel gene 
(HSPC025), part of the ZNF37A (zinc 
finger protein 37a (KOX 21)) gene, 
part of a putative novel gene, ESTs, 
STSs, GSSs and a CpG Island, 
complete sequence. 


1371 


99 


104 


gi12053123


Homo sapiens 


mRNA; cDNA DKFZp434K1421
(from clone DKFZp434K1421); 
complete cds. 


2624 


100 


104 


gi7292866 


Drosophila 
melanogaster 


CG15747 gene product 


362 


31 


104 


gi7549210 


Babesia 
bigemina 


200 kDa antigen p200 


298 


21 


105 


gi12053123


Homo sapiens 


mRNA; cDNA DKFZp434K1421 
(from clone DKFZp434K1421); 
complete cds. 


2898 


100 


105 


gi6841130 


Homo sapiens 


HSPC095 mRNA, partial cds. 


419 


100 


105 


gi7292866 


Drosophila 
melanogaster 


CG15747 gene product 


364 


30 


106 


gi10438207


Homo sapiens 


cDNA: FLJ21977 fis, clone HEP05976. 


1978 


99 


106 


gi15012167


Homo sapiens 


hypothetical protein FLJ21977, clone 
MGC:14918 IMAGE:3936410, mRNA, 
complete cds. 


1974 


99 


106 


AAB42499 


Homo sapiens 


Human ORFX ORF2263 polypeptide 
sequence SEQ ID NO:4526. 


1392 


100 


107 


gi1228035


Homo sapiens 


Human mRNA for KIAA0191 gene,
partial cds.


8020


99


107 


gi12697967


Homo sapiens 


mRNA for KIAA1711 protein, partial
cds. 


1593 


58 


107 


AAB94636 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15515. 


1004 


52 


108 


AAG81252 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:22. 


2146 


99 


108 


gi14035812


Homo sapiens 


unnamed protein product 


2146 


99 


108 


gi10440123


Homo sapiens 


cDNA: FLJ23436 fis, clone
HRC12692. 


2054 


100 


109 


gi200009 


Mus musculus 


myosin I 


5386 


96 


109 


gi1666471


Mus musculus 


myosin I heavy chain 


5360 


94 


109 


gi56733 


Rattus 
norvegicus 


myosin I heavy chain 


5268 


91 


110 


gi12053045


Homo sapiens 


mRNA; cDNA DKFZp434K1115
(from clone DKFZp434K1115);
complete cds. 


4840 


100 


110 


AAB65631 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 158. 


4835 


99 


110 


gi14133215


Homo sapiens 


mRNA for KIAA0781 protein, partial
cds. 


4678 


100 


111 


gi12642596


Homo sapiens 


nuclear receptor co-repressor/HDAC3 
complex subunit TBLR1 (TBLR1) 
mRNA, complete cds. 


2725 


100 


111 


AAB95225 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17352. 


2720 


99 


111 


gi10434648


Homo sapiens 


cDNA FLJ12894 fis, clone
NT2RP2004170, moderately similar to 
Homo sapiens mRNA for transducin
(beta) like 1 protein.


2720 


99 


112 


gi2224557 


Homo sapiens 


Human mRNA for KIAA0308 gene, 
partial cds. 


6666 


99 


112 


AAY23330 


Homo sapiens 


Human tumour suppressor (kismet) 
protein. 


5759 


98 


112 


gi7243213 


Homo sapiens 


mRNA for KIAA1416 protein, partial 
cds. 


5264 


59 


113 


gi12856019


Mus musculus 


putative 


1527 


95 


113 


gi3947604 


Caenorhabditis 
elegans 


cDNA EST yk129f1.3 comes from this
gene~cDNA EST yk129f1.5 comes
from this gene~cDNA EST yk203e4.3
comes from this gene~cDNA EST
yk191a9.3 comes from this
gene~cDNA EST yk262c10.3 comes
from this gene~cDNA EST yk278f9.3
comes from this gene~cDNA EST
yk325c7.3 comes from this
gene~cDNA EST yk337f1.3 comes
from this gene~cDNA EST yk449a2.3
comes from this gene~cDNA EST
yk203e4.5 comes from this
gene~cDNA EST yk191a9.5 comes
from this gene~cDNA EST yk278f9.5
comes from this gene~cDNA EST
yk262c10.5 comes from this
gene~cDNA EST yk325c7.5 comes
from this gene~cDNA EST yk337f1.5
comes from this gene~cDNA EST
yk448g10.5 comes from this
gene~cDNA EST yk449a2.5 comes
from this gene~cDNA EST yk636e2.3
comes from this gene~cDNA EST
yk636e2.5 comes from this
gene~cDNA EST yk550e8.3 comes
from this gene~cDNA EST yk557a9.3
comes from this gene~cDNA EST
yk579c12.3 comes from this
gene~cDNA EST yk614e7.3 comes
from this gene~cDNA EST yk653f1.3
comes from this gene~cDNA EST
yk672b2.3 comes from this
gene~cDNA EST yk550e8.5 comes
from this gene~cDNA EST yk556b1.5
comes from this gene~cDNA EST
yk557a9.5 comes from this
gene~cDNA EST yk579c12.5 comes
from this gene~cDNA EST yk606c8.5
comes from this gene~cDNA EST
yk614e7.5 comes from this gene


787


41


113


gi3947603 


Caenorhabditis
elegans 


cDNA EST yk167h7.3 comes from this
gene~cDNA EST yk167h7.5 comes
from this gene~cDNA EST yk289g5.3
comes from this gene~cDNA EST
yk332h9.3 comes from this
gene~cDNA EST yk289g5.5 comes
from this gene~cDNA EST yk332h9.5
comes from this gene~cDNA EST
yk391h4.5 comes from this
gene~cDNA EST yk653f1.5 comes
from this gene


787 


41 


114 


gi9280136 


Macaca 
fascicularis 


unnamed protein product 


3431 


95 


114 


gi4262617 


Caenorhabditis 
elegans 


contains similarity to dual specificity 
phosphatase, catalyitic domain 
(PfamPF00782, Score=16.8, E=7.4e- 
05,N=1) 


470 


35 


114 


gi5706724 


Homo sapiens 


Cdc14B3 phosphatase mRNA,
complete cds. 


166 


30 


115 


AAB95254 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17423. 


3114 


99 


115 


gi14042385


Homo sapiens 


cDNA FLJ14693 fis, clone
NT2RP2005360, weakly similar to 
Homo sapiens sentrin/SUMO-specific 
protease (SENP1) mRNA. 


3114 


99 


115 


gi10314023


Homo sapiens 


sentrin-specific protease (SENP2) 
mRNA, complete cds. 


3107 


99 


116 


gi4240227 


Homo sapiens 


mRNA for KIAA0869 protein, partial 
cds. 


4417 


98


116


gi13879506


Mus musculus 


Unknown (protein for 
IMAGE:3963643) 


4063 


89 


116 


AAB93267 


Homo sapiens 


Human protein sequence SEQ ID 
NO:12300. 


1895 


97 


117 


gi13235092


Homo sapiens 


mRNA for testis specific protein A14 
(TSGA14 gene). 


1957 


100 


117 


gi10438839


Homo sapiens 


cDNA: FLJ22445 fis, clone
HRC09438. 


1950 


99 


117 


gi13235344


Mus musculus 


testis specific protein a14


1704 


87 


118 


gi7959279 


Homo sapiens 


mRNA for KIAA1509 protein, partial
cds. 


6769 


99 


118 


AAB94101 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14322. 


1871 


99 


118 


gi10434073


Homo sapiens 


cDNA FLJ12531 fis, clone 
NT2RM4000199. 


1871 


99 


119 


AAM00936 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 412. 


3350 


100 


119 


AAB42828 


Homo sapiens 


Human ORFX ORF2592 polypeptide 
sequence SEQ ID NO:5184. 


2064 


100 


119


gi9557949 


Homo sapiens 


mRNA for hypothetical protein 
(ORF1), clone
Telethon(Italy_B41)_Strait02270_FL142.


1931 


100 


120 


AAB11082 


Homo sapiens 


Human secreted protein ZALPHA13 
protein. 


2783 


93 


120 


gi11230043


Homo sapiens 


unnamed protein product 


2783 


93 


120 


AAB37988 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HDPAS92. 


2747 


93 


121 


gi12852526


Mus musculus 


putative 


1689 


80 


121 


AAB41765 


Homo sapiens 


Human ORFX ORF1529 polypeptide 
sequence SEQ ID NO:3058. 


1576 


100 


121 


gi4406663 


Homo sapiens 


clone 24945 mRNA sequence, partial 
cds. 


1576 


100 


122 


AAR22958 


Homo sapiens 


Human proteasome component HC5. 


1010 


85 


122 


gi220026 


Homo sapiens 


Human mRNA for proteasome subunit
HC5. 


1010 


85 


122 


gi3790135


Homo sapiens 


Human DNA sequence from clone 
RP1-191N21 on chromosome 6q27. 
Contains a 7 transmembrane receptor
(rhodopsin family) (olfactory receptor 
like) pseudogene, the PDCD2 gene for 
programmed cell death 2 (RP8 
homolog), the TBP gene for TATA box 
binding protein, the gene for 
proteasome subunit HC5, ESTs, STSs 
and GSSs, complete sequence. 


1010 


85 


123 


AAB21027 


Homo sapiens 


Human nucleic acid-binding protein, 
NuABP-31. 


1456 


100 


123 


AAB45146 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 27 SEQ ID NO:87. 


1456 


100 


123 


gi4884258 


Homo sapiens 


mRNA; cDNA DKFZp564O092 (from 
clone DKFZp564O092); partial cds. 


1430 


100 


124 


gi13325436


Homo sapiens 


Similar to RIKEN cDNA
C330013D18 gene, clone MGC:11226
IMAGE:3937599, mRNA, complete
cds.


1394


100


124 


gi13559363


Homo sapiens 


MRPL9 mRNA for mitochondrial 
ribosomal protein L9 (L9mt), complete 
cds. 


1388 


99 


124 


AAG93251 


Homo sapiens 


Human protein HP02612. 


1153 


86 


125 


AAB85507 


Homo sapiens 


Human protein kinase SGK164. 


2949 


100 


125 


gi13543922


Homo sapiens 


Similar to RIKEN cDNA 5430416A05
gene, clone MGC:12903
IMAGE:3537086, mRNA, complete 
cds. 


2913 


100 


125 


gi12856491


Mus musculus 


putative 


2135 


79 


126 


gi12653817


Homo sapiens 


Similar to Male-specific RNA 84Dd, 
clone MGC:3092 IMAGE:3349383,
mRNA, complete cds. 


3399 


100 


126 


AAB94115 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14356. 


3392 


99 


126 


gi10434102


Homo sapiens 


cDNA FLJ12549 fis, clone
NT2RM4000689. 


3392 


99 


127 


gi7243187 


Homo sapiens 


mRNA for KIAA1403 protein, partial 
cds. 


6448 


98 


127 


gi12652971


Homo sapiens 


clone MGC:858 IMAGE:3357380,
mRNA, complete cds. 


3992 


100 


127 


AAB92872 


Homo sapiens 


Human protein sequence SEQ ID 
NO:11460.


3987 


99 


128 


AAB94324 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14807. 


1779 


99 


128 


gi10434528


Homo sapiens 


cDNA FLJ12816 fis, clone
NT2RP2002609, weakly similar to 2- 
HYDROXYMUCONIC 
SEMIALDEHYDE HYDROLASE (EC 
3.1.1.-).


1779 


99 


128 


AAB42143 


Homo sapiens 


Human ORFX ORF1907 polypeptide 
sequence SEQ ID NO:3814. 


1521 


100 


129 


gi6329945 


Homo sapiens 


mRNA for KIAA1140 protein, partial
cds. 


1857 


52 


129 


gi12805043


Homo sapiens 


clone IMAGE:3461487, mRNA, 
partial cds. 


1279 


54 


129 


gi7302173 


Drosophila 
melanogaster 


BcDNA:LD21719 gene product 


1261 


35 


130 


AAB28199 


Homo sapiens 


Human HMG-17 non histone 
chromosomal protein. 


322 


75 


130 


gi306864 


Homo sapiens 


Human non-histone chromosomal 
protein HMG-17 mRNA, complete cds. 


322 


75 


130 


gi32329 


Homo sapiens 


Human HMG-17 gene for non-histone 
chromosomal protein HMG-17. 


322 


75 


131 


gi16041794


Homo sapiens 


clone MGC:23591 IMAGE:4856946, 
mRNA, complete cds. 


2714 


99 


131 


gi15559462


Homo sapiens 


Similar to old astrocyte specifically 
induced substance, clone MGC:20215 
IMAGE:4546950, mRNA, complete 
cds.


2709 


99


131


gi4519621 


Mus musculus


OASIS protein 


2406 


91 


132 


gi7573591 


Homo sapiens 


Human DNA sequence from clone 
RP1-309K20 on chromosome 20 
Contains the gene for a novel protein 
similar to dysferlin, the SPAG4 gene 
for sperm associated antigen 4, the 
CPNE1 gene for Copine I (similar to 
KIAA0636), the gene KIAA0765 
(HRIHFB2091) for an RNA 
recognition motif (RNP, RRM or RBD
domain) containing protein and the 3'
end of the NIFS gene for cysteine
desulfurase. Contains ESTs, STSs, 
GSSs and four putative CpG islands, 
complete sequence. 


4972 


100 


132 


gi15559252


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:19528 IMAGE:3845090, mRNA, 
complete cds. 


4972 


100 


132 


gi15215375


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:16487 IMAGE:3956772, mRNA,
complete cds. 


4972 


100 


133 


gi12697774


Mus musculus 


acetyl-CoA synthetase 2 


3181 


87


133 


gi12697772


Bos taurus 


acetyl-CoA synthetase 2 


3056 


83 


133 


AAB34712 


Homo sapiens 


Human secreted protein encoded by 
DNA clone vo9 1. 


2721 


100 


134 


gi7020783 


Homo sapiens 


cDNA FU20580 fis, clone REC00516. 


848 


100 


134 


gi15012026


Homo sapiens 


Similar to hypothetical protein 
FLJ20580, clone MGC:13430 
IMAGE:4093763, mRNA, complete 
cds. 


848 


100 


134 


gi12833008


Mus musculus 


putative 


814 


85 


135 


AAB94473 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15139. 


1970 


100 


135 


AAG74880 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5644.


1970 


100 


135 


AAB43720 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:1165.


1970 


100 


136 


gi10047285


Homo sapiens 


mRNA for KIAA1605 protein, partial 
cds. 


3610 


99 


136 


gi16215453


Homo sapiens 


mRNA for bile acid beta-glucosidase. 


3610 


99 


136 


gi15030210


Homo sapiens 


KIAA1605 protein, clone MGC:16895 
IMAGE:4339156, mRNA, complete
cds. 


3610 


99 


137 


gi4914601


Homo sapiens 


mRNA; cDNA DKFZp564A026 (from
clone DKFZp564A026).


4171 


94 


137 


AAB94357 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14881. 


2195 


99 


137 


AAY45161 


Homo sapiens 


Human secreted protein clone 
C0139 3 protein sequence. 


2112 


100 


138 


gi313131 


Torpedo 
marmorata 


alpha-tubulin 


1192 


97 


138 


gi14198110


Mus musculus 


tubulin alpha 1 


1192 


97 


138 


gi13435777


Mus musculus


tubulin alpha 6 


1192 


97


139


AAB94856 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16044. 


2138 


100 


139 


AAB94628 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15490.


2138


100 


139 


gi10436294


Homo sapiens 


cDNA FLJ13970 fis, clone
Y79AA1001533, moderately similar to 
Mouse mRNA for RNA polymerase I 
associated factor (PAF53). 


2138 


100 


140 


gi2642187 


Rattus 
norvegicus 


endo-alpha-D-mannosidase 


1415 


67 


140 


AAB95204 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17303. 


1094 


66 


140 


gi10434559


Homo sapiens 


cDNA FLJ12838 fis, clone
NT2RP2003230, moderately similar to 
Rattus norvegicus endo-alpha-D- 
mannosidase (Enman) mRNA. 


1094 


66 


141 


gi3449308 


Homo sapiens 


mRNA for MEGF8, partial cds. 


9785 


100 


141 


gi6681364 


Rattus 
norvegicus


MEGF8 


4772 


95 


141 


gi10728654


Drosophila 
melanogaster 


CG7466 gene product 


2902 


34 


142 


AAY29517 


Homo sapiens 


Human lung tumour protein SAL-82 
predicted amino acid sequence. 


3048 


100 


142 


gi13958036


Homo sapiens 


FYVE-finger protein EIP1 mRNA, 
complete cds. 


3048 


100 


142 


AAY29861 


Homo sapiens 


Human secreted protein clone cb98 4. 


3041 


99 


143 


gi14718539


Homo sapiens 


HIC-3 mRNA, complete cds. 


3178 


99 


143 


gi5689371 


Homo sapiens 


mRNA for KIAA1020 protein, partial 
cds. 


2970 


99 


143 


gi7328028 


Homo sapiens 


mRNA; cDNA DKFZp434F0616 (from 
clone DKFZp434F0616); partial cds. 


1738 


100 


144 


gi12620400


Homo sapiens 


mitochondrial carrier protein CGI-69 
long form mRNA, complete cds. 


1856 


99 


144 


AAB42783 


Homo sapiens 


Human ORFX ORF2547 polypeptide 
sequence SEQ ID NO:5094. 


1804 


96 


144 


gi10438783


Homo sapiens 


cDNA: FLJ22407 fis, clone 
HRC08407. 


1798 


97 


145 


gi2792366 


Homo sapiens 


unknown protein IT12 mRNA, partial 
cds. 


4390 


99 


145 


gi1843399


Homo sapiens 


mRNA, partial cds, clone:RES4-25. 


3676 


99 


145 


gi14602505


Homo sapiens 


clone IMAGE:3936655, mRNA, 
partial cds. 


2366 


99 


146 


gi13359167


Homo sapiens 


mRNA for KIAA1646 protein, partial 
cds. 


2581 


99 


146 


AAY96059 


Homo sapiens 


Human sphingosine kinase C. 


2456 


99 


146 


gi6572330 


Homo sapiens 


Human DNA sequence from clone 
59H18 on chromosome 22. Contains 
the 3' part of the gene for KIAA0767, a 
novel gene, ESTs, STSs, GSSs and a 
putative CpG island, complete 
sequence. 


1627 


96 


147 


gi14043303


Homo sapiens 


exonuclease NEF-sp, clone
MGC:15944 IMAGE:3537866, mRNA,
complete cds.


4043


100






147 


gi13272524


Homo sapiens 


exonuclease NEF-sp mRNA, complete 
cds. 


4039 


99 


147 


gi12053043


Homo sapiens 


mRNA; cDNA DKFZp434J0315 (from 
clone DKFZp434J0315); complete cds. 


3843 


95 


148 


gi7243037 


Homo sapiens 


mRNA for KIAA1328 protein, partial 
cds. 


2894 


100 


148 


gi13874541


Macaca 
fascicularis 


hypothetical protein 


2492 


93 


148 


gi1335313


Homo sapiens 


Human muscle mRNA for embryonic 
myosin heavy chain (SMHCE). 


129 


24 


149 


AAB42399 


Homo sapiens 


Human ORFX ORF2163 polypeptide 
sequence SEQ ID NO:4326. 


1362 


91 


149 


AAB42366 


Homo sapiens 


Human ORFX ORF2130 polypeptide
sequence SEQ ID NO:4260. 


626 


100 


149 


gi7298594 


Drosophila 
melanogaster 


CG10189 gene product


223 


35 


150 


AAB95372 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17692. 


1538 


99 


150 


gi10435150


Homo sapiens 


cDNA FLJ13220 fis, clone
NT2RP4002047, moderately similar to 
GTP-BINDING PROTEIN LEPA. 


1538 


99 


150 


gi10437720


Homo sapiens 


cDNA: FLJ21595 fis, clone 
COL07069. 


1438 


100 


151 


gi3327080 


Homo sapiens 


mRNA for KIAA0633 protein, partial 
cds. 


6823 


99


151 


gi857571 


Mus musculus 


cordon-bleu gene product 


1345 


81 


151 


gi6094680 


Homo sapiens 


PAC clone RP5-1168M19 from 7p12-
q11.21, complete sequence.


1342 


100 


152 


gi15451265


Macaca 
fascicularis 


hypothetical protein 


2728 


98 


152 


AAB41597 


Homo sapiens 


Human ORFX ORF1361 polypeptide 
sequence SEQ ID NO:2722. 


2650 


100 


152 


gi5689443 


Homo sapiens 


mRNA for KIAA1053 protein, partial 
cds. 


2650 


100 


153 


gi14036062


Homo sapiens 


unnamed protein product 


1930 


100 


153 


AAG81377 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:272. 


1925 


99 


153 


gi12833112


Mus musculus 


putative 


1727 


88 


154 


gi12832455


Mus musculus 


putative 


1220 


89 


154 


gi15080314


Homo sapiens 


Similar to RIKEN cDNA 0610010D20
gene, clone MGC:20590
IMAGE:4310241, mRNA, complete
cds. 


514 


100 


154 


gi6002488 


Penicillium 
chrysogenum 


hypothetical protein 


338 


31 


155 


gi14017889


Homo sapiens


mRNA for KIAA1836 protein, partial 
cds. 


2511 


100


155 


AAB94592 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15402. 


972 


50 


155 


gi10435321


Homo sapiens 


cDNA FLJ13337 fis, clone
OVARC1001880. 


972 


50 


156 


gi14550510


Homo sapiens 


pseudouridylate synthase 1, clone
MGC:2736 IMAGE:2822709, mRNA,
complete cds.


2123


100






156 


gi12804097


Homo sapiens 


Similar to pseudouridine synthase 1, 
clone MGC:11268 IMAGE:3943243,
mRNA, complete cds. 


2123 


100 


156 


gi4455035 


Homo sapiens 


pseudouridine synthase 1 (PUS1) 
mRNA, partial cds. 


1927 


99 


157 


AAY58052 


Homo sapiens 


Human protein kinase H2LAU20 
protein sequence. 


3198 


98 


157 




Homo sapiens


protein kinase DYRK4 (DYRK4)
mRNA, partial cds. 


2844 


100 


157 


AAW71685 


Homo sapiens 


Amino acid sequence of human 
serine/threonine protein kinase


1909 


97 


158 


gl / J\J\JZ7 J X. 


Drosophila
melanogaster


BcDNA-LD21504 gene product


971 


62 


158


gi4972728


Drosophila
melanogaster


unknown 


971 


62 


158


AAR97646 


Homo sapiens


Ribosomal S3 protein 17


831 


99 


159




Homo sapiens


Phnsnhatuse 1 nrotein-like nrotein. 
MEM6 


1514 


100 


159 


gi15551577


Homo sapiens 


unnamed protein product 


1514 


100 


159 


AAB95633 


Homo sapiens


Human protein sequence SEQ ID
NO:18363.


1510 


99 


160 


gi12804573


Homo sapiens


Similar to CG11334 gene product,
clone MGC:3207 IMAGE:3501899,
mRNA, complete cds. 


1859 


100 


160 


gi12851419


Mus musculus 


putative 


1590 


86 


160


gi7302053 


Drosophila 
melanogaster 


CG11334 gene product


1046 


59 


161 


gi1580781


Homo sapiens 


Human beige-like protein (BGL) 
mRNA, partial cds. 


9734 


99 


161 


gi10180266


Mus musculus 


LBA 


9333 


86 


161 


gi10257401


Mus musculus 


LBA isoform beta


8920 


86 


162 


gi15082589


Homo sapiens


clone MGC:4408 IMAGE:2906200,
mRNA, complete cds. 


2065 


99 


162 


gi15638615


Arabidopsis
thaliana


HEN1 


350 


37 


162 


gi13241746


Arabidopsis
thaliana


CORYMBOSA2 


350 


37 


163 


gi15291227


Drosophila 
melanogaster 


GH13040p


701 


40 


163 


gi7303780 


Drosophila 
melanogaster 


CG12214 gene product 


701 


40 


163 


AAB95882 


Homo sapiens 


Human protein sequence SEQ ID
NO:18991. 


501 


100 


164 


gi3327170 


Homo sapiens 


mRNA for KIAA0678 protein, partial 
cds. 


5255 


100 


164 


AAB95304 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17542. 


4431 


99 


164 


gi14134120


Caenorhabditis 
elegans 


endocytosis protein RME-8 


2127 


42 


165 


AAB53427 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:967. 


813 


96


165


gi13905098


Mus musculus 


B-cell translocation gene 1, anti- 
proliferative 


813 


96 


165 


gi293306 


Mus musculus 


B-cell translocation gene-1 protein 


813 


96 


166 


gi13365897


Macaca 
fascicularis 


hypothetical protein 


2501 


97 


166 


AAY02168 


Homo sapiens 


A facilitative glucose transporter 
protein GLUT8. 


870 


99 


166 


gi13445575


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


835 


39 


167 


gi13365897


Macaca 
fascicularis 


hypothetical protein 


2173 


97 


167 


AAY02168 


Homo sapiens 


A facilitative glucose transporter 
protein GLUT8.


870 


99 


167 


gi13445575


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


678 


37 


168 


gi10047251


Homo sapiens 


mRNA for KIAA1588 protein, partial
cds. 


3292 


100 


168 


gi14424704


Homo sapiens 


clone MGC:15071 IMAGE:4110510,
mRNA, complete cds. 


2315 


100 


168 


gi4567179 


Homo sapiens 


chromosome 19, BAC 37295 (CIT-B- 
21A4), complete sequence. 


1269 


43 


169 


gi15558943


Homo sapiens 


guanylate binding protein 4 mRNA, 
complete cds. 


3134 


99 


169 


gi1174187


Mus musculus 


purine nucleotide binding protein 


2260 


70 


169 


gi193444


Mus musculus 


guanylate binding protein 


1986 


66 


170 


gi14585859


Homo sapiens 


hypothetical protein SB138


1121 


100 


170 


gi6665778 


Mus musculus 


cyclin ania-6b 


1052 


92 


170 


gi12841169


Mus musculus 


putative 


1052 


92 


171 


AAB64407 


Homo sapiens 


Amino acid sequence of human 
intracellular signalling molecule 
INTRA39. 


3394 


100 


171 


AAB71963 


Homo sapiens 


Human TGF-beta receptor encoded by 
cDNA clone HFIHY04. 


3394 


100 


171 


gi10438113


Homo sapiens 


cDNA: FLJ21908 fis, clone HEP03830.


3385 


99 


172 


gi12652533


Homo sapiens 


clone MGC:2637 IMAGE:3505128, 
mRNA, complete cds. 


676 


89 


172 


AAB67453 


Homo sapiens 


Amino acid sequence of a human 
chaperone polypeptide. 


668 


88 


172 


gi9758421 


Arabidopsis 
thaliana 


gene_id:MHF15.7~similar to unknown
protein- 


199 


28 


173 


AAB97025


Homo sapiens 


Human colon carcinoma suppressor 
gene-related protein. 


1773 


61 


173 


gi9857318 


Homo sapiens 


Asef mRNA for APC-stimulated 
guanine nucleotide exchange factor, 
complete cds. 


1773 


61 


173 


gi8809845 


Homo sapiens 


chromosome 2q22 RhoGEF mRNA, 
complete cds. 


1700 


61 


174 


gi12052828


Homo sapiens 


mRNA; cDNA DKFZp564N1062 
(from clone DKFZp564N1062);
complete cds. 


1601 


99 


174 


gi12850603


Mus musculus


putative


1062 


92


174


AAB94655 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15568.


671 


100 


175 


gi15080282


Homo sapiens 


Similar to putative sialoglycoprotease 
type 2, clone MGC:20293 
IMAGE:4121450, mRNA, complete
cds. 


1747 


99 


175 


gi11071727


Homo sapiens 


mRNA for putative sialoglycoprotease 
type 2. 


1707 


92 


175 


gi12847276


Mus musculus


putative 


1541 


84 


176 


AAB36628 


Homo sapiens 


Human FLEXHT-50 protein sequence 
SEQ ID NO:50. 


527 


100 


176 


AAB94208 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14557.


527 


100 


176 


AAG01512 


Homo sapiens


Human secreted protein SEQ ID NO:
5593.


527 


100 


177 




Homo sapiens


Similar to RIKEN cDNA 2810442O16
gene, clone MGC:23197
IMAGE:4861869, mRNA, complete
cds. 


2084 


100


177 


gi11493155


Homo sapiens 


Human DNA sequence from clone
RP5-852M4 on chromosome 20. 
Contains the gene encoding the HBV 
associated factor, a novel gene similar 
to Drosophilia CG17883, a putative
novel gene, two CpG islands, ESTs, 
GSSs, and STSs, complete sequence. 


1952 


100 


177 


gi12840168


Mus musculus


putative 


1938 


93 


178 


AAB87034 


Homo sapiens 


Human secreted protein TANGO 339, 
SEQ ID NO:3.


1449 


100 


178 


AAY76266 


Homo sapiens 


Human secreted protein encoded by 
gene 10 fragment 


1449 


100 


178 


AAB87135 


Homo sapiens 


Human secreted protein TANGO 339 
F20Y variant, SEQ ID NO:139. 


1446 


99 


179 


gi434763 


Homo sapiens 


Human mRNA for KIAA0120 gene, 
complete cds. 


1048 


100 


179 


gi14424677


Homo sapiens 


transgelin 2, clone MGC:15279
IMAGE:4301018, mRNA, complete 
cds. 


1048 


100 


179 


gi9956026 


Homo sapiens 


clone CDABP0035 mRNA sequence. 


1048 


100 


180 


AAB31677 


Homo sapiens 


Amino acid sequence of a human 
protein having a hydrophobic domain. 


2803 


100 


180 


AAE03346 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO:120. 


2803 


100 


180 


AAE03310 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO:84. 


2803 


100 


181 


AAB41910 


Homo sapiens 


Human ORFX ORF1674 polypeptide 
sequence SEQ ID NO:3348. 


1530 


99 


181 


gi5262467 


Homo sapiens 


mRNA; cDNA DKFZp564I122 (from 
clone DKFZp564I122). 


1530 


99 


181 


gi12849716


Mus musculus 


putative 


1259 


82 


182 


gi2072972 


Homo sapiens 


Human L1 element L1.25 p40 and
putative p150 genes, complete cds.


497 


53 


182 


AAB64943 


Homo sapiens 


Human secreted protein sequence
encoded by gene 7 SEQ ID NO:121.


494


54






182 


gi5070622


Homo sapiens 


retrotransposon L1 insertion in X-
linked retinitis pigmentosa locus, 
complete sequence. 


494 


53 


183


AAB59191 


Homo sapiens 


Human NADE. 


217 


47 


183 


gi8452894 


Homo sapiens 


p75NTR-associated cell death executor 
(NADE) mRNA, complete cds. 


217 


47 


183 


gi189379


Homo sapiens 


Human unknown protein from clone 
pHGR74 mRNA, complete cds. 


217 


47 


184 


AAB88468 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0263. 


4931 


97 


184 


gi14272788


Homo sapiens 


unnamed protein product 


4931 


97 


184 


gi577301 


Homo sapiens 


Human mRNA for KIAA0090 gene, 
partial cds. 


4650 


99 


185 


AAG64953 


Homo sapiens 


Human ATP-dependent helicase 
protein 68. 


3169 


100 


185 


gi12052748


Homo sapiens 


mRNA; cDNA DKFZp564B1023 
(from clone DKFZp564B1023); 
complete cds. 


2716 


100 


185 


gi12836314


Mus musculus 


putative 


2655 


83 


186 


gi14017781


Homo sapiens 


mRNA for KIAA1782 protein, partial 
cds. 


2834 


99 


186 


gi4062983 


Mus musculus 


Eos protein 


2747 


95 


186 


gi11612390


Homo sapiens 


zinc finger transcription factor Eos 
mRNA, complete cds. 


2603 


98 


187 


AAB95721 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18592.


2419 


100 


187 


gi10436538


Homo sapiens 


cDNA FLJ14153 fis, clone
NT2RM1000092, weakly similar to
MULTIDRUG RESISTANCE 
PROTEIN 2. 


2419 


100 


187 


gi12248763


Homo sapiens 


mRNA for SMAP-4, complete cds. 


2323 


96 


188 


gi13278906


Homo sapiens 


clone MGC:4440 IMAGE:2959536, 
mRNA, complete cds. 


1040 


100 


188 


gi13278819


Homo sapiens 


clone MGC:2776 IMAGE:2959536, 
mRNA, complete cds. 


1040 


100 


188


AAB95829 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18847. 


618 


79 


189 


gi14602977


Homo sapiens 


Similar to KIAA0789 gene product, 
clone MGC:16602 IMAGE:4110708,
mRNA, complete cds. 


3100 


99 


189 


gi3043570 


Homo sapiens 


mRNA for KIAA0523 protein, partial
cds. 


2564 


100 


189 


gi14133217


Homo sapiens 


mRNA for KIAA0789 protein, partial 
cds. 


1463 


49 


190 


gi9717245


Mus musculus 


cytoplasmic dynein heavy chain 


5569 


98 


190 


gi294543 


Rattus 
norvegicus 


dynein heavy chain 


5557 


98 


190 


gi402528 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


5535 


98 


191 


gi13537204


Homo sapiens 


mRNA for MAST205, complete cds. 


6834 


98 


191 


gi406058 


Mus musculus 


protein kinase 


6343 


86 


191 


gi3882335 


Homo sapiens 


mRNA for KIAA0807 protein, partial
cds.


6300


98






192 


gi12847109


Mus musculus 


putative 


1356 


79 


192 


gi13623271


Homo sapiens 


Similar to RIKEN cDNA 2600005P05
gene, clone MGC:11321
IMAGE:3951804, mRNA, complete 
cds. 


1332 


100 


192 


gi12847837


Mus musculus 


putative 


1170 


76 


193 


gi38149 


Pongo 
pygmaeus 


epsilon-globin 


397 


100 


193 


gi903731 


Gorilla gorilla 


epsilon-globin 


397 


100 


193 


gi903707 


Pan 

troglodytes 


epsilon-globin 


397 


100 


194 


AAB74695 


Homo sapiens 


Human membrane associated protein 
MEMAP-1. 


1799 


100 


194 


AAE01340 


Homo sapiens 


Human gene 22 encoded secreted 
protein fragment, SEQ ID NO:205. 


1799 


100 


194 


gi15929183


Homo sapiens 


modulator of apoptosis 1, clone 
MGC:9487 IMAGE:3922055, mRNA, 
complete cds. 


1799 


100 


195 


AAG93260 


Homo sapiens 


Human protein HP10106. 


1769 


100 


195 


gi15029765


Mus musculus 


RIKEN cDNA 2810039M17 gene 


1650 


91 


195 


gi12849932


Mus musculus 


putative 


1650 


91 


196 


gi14017843


Homo sapiens 


mRNA for KIAA1813 protein, partial 
cds. 


3434 


100 


196 


gi15193290


Homo sapiens 


LAPSER1 (LAPSER1) mRNA, 
complete cds. 


3309 


100 


196 


gi8217421 


Homo sapiens 


Human DNA sequence from clone
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ
(DHR, GLGF) domain protein, the
gene for a novel protein similar to
KIAA0552, KIAA0341 and Fugu
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46GU.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven
putative CpG islands, complete
sequence.


3264 


100 


197 


gi1458241


Caenorhabditis 
elegans 


Hypothetical protein B0507.2 


782


39 


197 


gi12832510


Mus musculus 


putative 


490 


89 


197 


AAB54014 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:466. 


242 


100 


198 


gi500747 


Mus musculus 


capping protein beta-subunit, isoform 1 


1440 


98 


198 


gi212902 


Gallus gallus 


actin-capping protein Z beta subunit 


1432 


98 


198 


gi12805189


Mus musculus 


capping protein (actin filament) muscle
Z-line, beta


1318


92



WO 02/081731



PCT/US02/01222



Table 2



SEQ ID NO:


Accession No.


Species


Description


Score


%
Identity






199 


gi14017787


Homo sapiens 


mRNA for KIAA1785 protein, partial 
cds. 


3195 


100 


199 


gi13436428


Homo sapiens 


Similar to feminization 1 a homolog 
(C. elegans), clone MGC:4216
IMAGE:2957950, mRNA, complete
cds. 


2197 


64 


199 


gi12836689


Mus musculus


putative 


2164 


65 


200 


gi7959811 


Homo sapiens 


PRO1167


389 


100 


200 


gi2736345 


Caenorhabditis 
elegans 


contains similarity to G-coupled protein 
receptors 


69 


33 


200 


gi7504953 


Caenorhabditis 
elegans 


hypothetical protein H22D07.1 -
Caenorhabditis elegans


69 


33 


201 


gi12697975


Homo sapiens 


mRNA for KIAA1715 protein, partial 
cds. 


2230 


100 


201 


AAB42461 


Homo sapiens 


Human ORFX ORF2225 polypeptide 


1015 


100 


201 


gi12844031


Mus musculus 


putative 


567 


92 


202


, gl/zyOl /O 


Drosophila
melanogaster 


\^\jjL03y gene proaucc 




27 


202




Homo sapiens


rfYNJA- FT 199400 fi« rlrmp 

HRC10983. 


184 


97


202




Caenorhabditis
elegans


rTYWA F^T vlr^OlVi? ^ rnmpc from this 

gene-cDNA EST yk523d4.5 comes
from this gene-cDNA EST yk553f6.5
comes from this gene-cDNA EST
yk595g12.5 comes from this
gene-cDNA EST yk606g10.5 comes
from this gene-cDNA EST yk652f3.5
comes from this gene


182 


21 


203 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1725 


100 


203 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


1484 


62 


203 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


1484 


62 


204 


AAM00844 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 207. 


1051 


98 


204 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


779 


69 


204 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


779 


69 


205 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1576 


92 


205 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


1349 


57 


205 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


1349 


57 


206 


gi7242969 


Homo sapiens 


mRNA for KIAA1307 protein, partial 
cds. 


8582 


99 


206 


AAM00860 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 223. 


4841 


98 


206 


gi4426611 


Drosophila
melanogaster


pushover


2137


46








207 


AAB62210 


Homo sapiens 


Human ABCA2 transporter protein. 


9835 


99 


207 


gi13173186


Homo sapiens 


ABC transporter ABCA2 (ABCA2) 
mRNA, complete cds. 


9835 


99 


207 


gi9957467 


Homo sapiens 


ATP-binding cassette sub-family A 
member 2 (ABCA2) mRNA, complete 
cds. 


9835 


99 


208 


AAB94358 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14883.


2268 


99 


208 


gi10434632


Homo sapiens 


cDNA FLJ12886 fis, clone 
NT2RP2004041, weakly similar to 
SYNAPSINS IA AND IB. 


2268 


99 


208 


gi12052738


Homo sapiens 


mRNA; cDNA DKFZp564H1322 
(from clone DKFZp564H1322); 
complete cds. 


2268 


99 


209


gi14627122


Homo sapiens 


Human DNA sequence from clone
RP4-583P15 on chromosome 20
Contains ESTs, STSs, GSSs and ten
CpG islands. Contains the TNFRSF6B
gene for tumor necrosis factor receptor
6b (decoy), the 3' part of the
KIAA1088 gene, the ARFRP1 gene for
ADP-ribosylation factor related protein
1, two genes for novel proteins, the
gene for a GLUT4 enhancer factor and
the gene for a novel zinc finger protein
similar to rat RIN ZF and the gene for a
novel BTB/POZ domain containing
zinc finger protein, complete sequence.






209 


gi13162677


Homo sapiens 


GLUT4 enhancer factor mRNA, 
complete cds. 


2055 


98 


209 


gi12655101


Homo sapiens 


clone IMAGE:3140406, mRNA,
partial cds. 


1766 


100 


210 


gi14279329


Homo sapiens 


ubiquitin specific protease (USP28)
mRNA, complete cds. 


4131 


92 


210 


gi7959297 


Homo sapiens 


mRNA for KIAA1515 protein, partial 
cds. 


3872 


100 


210 


AAB31552 


Homo sapiens 


A human ubiquitin specific protease 25
(USP25). 


2058 


48 


211 


AAB36579 


Homo sapiens 


Human FLEXHT-1 protein sequence 
SEQ ID NO:1.


1829 


100


211 


AAB94048 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14211. 


1825 


99 


211 


gi10433984


Homo sapiens 


cDNA FLJ12475 fis, clone
NT2RM1000962. 


1825 


99 


212 


gi15824499


Homo sapiens 


GalNAc-4-O-sulfotransferase 1
mRNA, complete cds. 


2238 


100 


212 


gi11990885


Homo sapiens 


GalNAc4ST mRNA for GalNAc 4-
sulfotransferase, complete cds. 


2238 


100 


212 


gi15559803


Homo sapiens 


carbohydrate (N-acetylgalactosamine
4-O) sulfotransferase 8, clone
MGC:20987 IMAGE:4635405, mRNA, 
complete cds. 


2238 


100


213


AAB43387 


Homo sapiens 


Human ORFX ORF3151 polypeptide 
sequence SEQ ID NO:6302. 


1056 


100 


213 


gi15292317


Drosophila
melanogaster 


LD46863p 


549 


50 


213 


gi7302029 


Drosophila 
melanogaster 


CG12054 gene product 


549 


50 


214 


gi12843216


Mus musculus 


putative 


913 


84 


214 


gi14585867


Homo sapiens 


hypothetical protein SB145


297 


44 


214 


gll43ooioO 


Macaca 
fascicularis 


hypothetical protein 


/.yj 


AA 


215


gi14133219


Homo sapiens 


mRNA for KIAA0833 protein, partial
cds. 


"71 0< 

/ ISO 


99


215 


gio580410 


Homo sapiens 


Human DNA sequence from clone
RP3-467L1 on chromosome 1p36.21-
36.33. Contains the 3' part of gene
KIAA0833, the VAMP3 gene for
vesicle-associated membrane protein 3
(cellubrevin), the PER3 gene for period
(Drosophila) homolog 3 and the gene
for urotensin II. Contains two putative
CpG islands, ESTs, STSs and GSSs,
complete sequence.


3642 


99


215 


AAB42729 


Homo sapiens


Human ORFX ORF2493 polypeptide
sequence SEQ ID NO:4986. 


997 


54 


216 


gi7293088


Drosophila
melanogaster 


CG9213 gene product


811 


30 


216 


gi15810333


Arabidopsis 
thaliana


unknown protein 


713 


28 


216 


gi13324888


Caenorhabditis 
elegans 


Hypothetical protein B0361.2 


710 


34 


217 


gi2443331


Xenopus
laevis 


Nfri 


2421 


75 


217 


AAB34944 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 20 SEQ ID NO:148. 


1129 


91 


217 


gi15292543


Drosophila 
melanogaster 


SD06560p 


911 


36 


218 


gi7243111


Homo sapiens


mRNA for KIAA1365 protein, partial 
cds. 


3855 


100 


218 


gi1657758


Rattus 
norvegicus 


densin-180 


3640 


93 


218 


gi8570180 


Rattus 
norvegicus 


densin-180 variant D


1250 


83 


219 


gi14017839


Homo sapiens 


mRNA for KIAA1811 protein, partial
cds. 


1726 


80 


219 


gi3217028 


Homo sapiens 


mRNA for putative serine/threonine 
protein kinase, partial.


1450 


84 


219 


gi7294217 


Drosophila
melanogaster 


CG6114 gene product


1055 


70 


220 


gi7297674 


Drosophila 
melanogaster 


CG13139 gene product 


942 


75 


220 


gi12857050


Mus musculus 


putative 


767 


62 


220 


gi15636900


Gallus gallus 


avEna neural variant 


139 


52 


221 


gi15489242


Homo sapiens 


clone IMAGE:3859726, mRNA,
partial cds.


1001


88






221 


gi13543991


Homo sapiens 


clone IMAGE:3627860, mRNA, 
partial cds. 


1001 


88 


221 


gi12847182


Mus musculus


putative 


328 


39 


222 


gi14133209


Homo sapiens 


mRNA for KIAA0654 protein, partial 
cds. 


6089 


99 


222 


gi930343 


Homo sapiens 


Human LAR-interacting protein 1b
mRNA, complete cds. 


3559 


60 


222


gi930341


Homo sapiens 


Human LAR-interacting protein 1a
mRNA, complete cds. 


3503 


60 


223 


gi12620207


Homo sapiens 


C1orf25 mRNA, complete cds.


3807 


98 


223


glS/505430 


Homo sapiens 


Human DNA sequence from clone
GS1-120K12 on chromosome 1q25.3-
31.2. Contains the gene for ring finger
protein DING or BAP-1, an FTH1
(ferritin, heavy polypeptide 1)
pseudogene, the 3' end of the gene for a
novel protein similar to archaeal, yeast
and worm N2,N2-dimethylguanosine
tRNA methyltransferase, ESTs, STSs,
GSSs and two putative CpG islands,
complete sequence.


2300 


98 


223 


oil 7815704 




puUtUVv 




oo 


224 


gi14595658


Xenopus 
laevis 


LIM protein prickle


2865 


67 


224 


gi10727796


Drosophila
melanogaster


esn gene product 


698 


42 


224 


gi6634092


Drosophila
melanogaster








225 


gi13375149


Homo sapiens 


Human DNA sequence from clone
RP5-1118M15 on chromosome 20
Contains part of a gene similar to P14
Bos taurus (P14L), a novel gene, ESTs,
STSs, GSSs and a CpG Island, 
complete sequence. 


957 


99 


225 


gi7259265 


Mus musculus 


contains transmembrane (TM) region 


314 


50 


225 


AAY53871 


Homo sapiens 


A human brain-derived signalling 
factor polypeptide. 


299 


45 


226 


gi12803987


Homo sapiens 


clone MGC:4174 IMAGE:3634226, 
mRNA, complete cds. 


743 


100 


226 


gi12805417


Mus musculus 


Unknown (protein for MGC:7354) 


444 


66 


226 


gi12849498


Mus musculus 


putative 


235 


72 


227 


AAY91629 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 23 SEQ ID NO:302. 


1391 


87 


227 


gi7677403 


Homo sapiens 


F-box protein FBG2 (FBG2) mRNA, 
complete cds. 


1391 


87 


227 


AAY83046 


Homo sapiens 


F-box protein FBP-6. 


1333 


82 


228 


gi15079958


Homo sapiens 


chromosome 11 open reading frame
24, clone MGC:19741
IMAGE:3614861, mRNA, complete
cds. 


2231 


99 


228 


gi11527205


Homo sapiens 


DM4E3 (C11orf24) mRNA, complete
cds. 


2224 


99 



228 


AAB18965


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2055 


99 


229 


gi15930199


Homo sapiens 


Similar to RIKEN cDNA 4921523I18
gene, clone MGC:9467 
IMAGE:3914747, mRNA, complete 
cds. 


1451 


99 


229 


gi13278594


Mus musculus 


RIKEN cDNA 4921523I18 gene


1440 


97 


229 


gi12856904


Mus musculus 


putative 


1440 


97 


230 


gi15680131


Homo sapiens 


hypothetical protein FLJ12171, clone
MGC:19889 IMAGE:4652087, mRNA, 
complete cds. 


1638 


100 


230


gi14043242


Homo sapiens


hypothetical protein FLJ12171, clone
MGC:15694 IMAGE:3351601, mRNA,
complete cds. 


1638 


100 


230 


AAB93912 


Homo sapiens 


Human protein sequence SEQ ID 
NO:13880. 


1634 


99 


231 


AAB56947 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1525.


779 


100 


231 


AAB68408 


Homo sapiens 


Amino acid sequence of a human 
NOV1 polypeptide. 


574 


100 


231 


AAY81695 


Homo sapiens 


Human PTN protein sequence. 


574 


100 


232 


gi11138034


Homo sapiens 


mRNA for KIAA1173 protein,
complete cds. 


2665 


100 


232 


AAG89259 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
379. 


2654 


99 


232 


gil2834372 


Mus musculus 


putative 


2427 


90 


233 


AAB98612 


Homo sapiens 


Human tumour suppressor gene, 
TSG16, protein. 


1706 


55 


233 


gil 1596412 


Homo sapiens 


GAC-1 (GAC-1) mRNA, complete cds. 


893 


77 


233 


gi4240237 


Homo sapiens 


mRNA for KIAA0874 protein, partial 
cds. 


893 


77 


234 


AAB41108 


Homo sapiens 


Human ORFX ORF872 polypeptide
sequence SEQ ID NO:1744.


4170 


99 


234 


gi6331287 


Homo sapiens 


mRNA for KIAA1274 protein, partial 
cds. 


3936 


99 


234 


pil545959 


Mus musculus 


paladin 


3560 


80 


235 


gi9368849


Homo sapiens 


mRNA; cDNA DKFZp761G2113
(from clone DKFZp761G2113). 


972 


99 


235 


gi7293878 


Drosophila 
melanogaster 


CG13379 gene product 


274 


36 


235 


gil4532482 


Arabidopsis 
thaliana 


AT5g58570/mznl_20 


152 


31 


236 


gi3242242 


Mus musculus 


hyperpolarization-activated cation 
channel, HAC2 


4309 


91 


236 


gi7407645 


Rattus 
norvegicus 


hyperpolarization-activated, cyclic 
nucleotide-gated potassium channel 1 


4306 


91 


236 


gi2708316 


Mus musculus 


brain cyclic nucleotide gated 1; Bcng- 
1; brain specific ion channel protein 


4301 


91 


237 


AAB13370


Homo sapiens 


Human brain-associated protein 
HBAP-1. 


1055 


100 


237 


gi9944291


Homo sapiens 


TTYH1 mRNA, complete cds. 


1055 


100 


237 


gi9651109 


Macaca 
fascicularis 


TTYH1


1032 


98 
238


AAU00476 


Homo sapiens 


Human INTERCEPT 400 protein. 


1428 


100 


238 


AAY79266 


Homo sapiens 


Human elongase homologue HS3. 


1428 


100 


238 


AAB29648 


Homo sapiens 


Human membrane-associated protein 
HUMAP-5. 


1428 


100 


239 


AAB84885 


Homo sapiens 


Human protein, SEQ ID 14.


4029 


99 


239 


AAB84882 


Homo sapiens 


Human protein, SEQ ID 6. 


4029 


99 


239 


gi5262593 


Homo sapiens 


mRNA; cDNA DKFZp434N093 (from 
clone DKFZp434N093); partial cds. 


3684 


99 


240 


gil3477247 


Homo sapiens 


Similar to RIKEN cDNA 
5031400M07 gene, clone MGC:13079 
IMAGE:3840918, mRNA, complete
cds. 


2153 


100 


240 


AAB18987 


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2148 


99 


240 


gi7670425 


Mus musculus


unnamed protein product 


1904 


89 


241 


AAG63222 


Homo sapiens 


Amino acid sequence of a human lipid 
metabolism enzyme. 


2194 


100 


241 


gi!4861069 


Mus musculus


phosphatidyl inositol phosphate kinase 
type II gamma 


2120 


95 


241 


gi3387798 


Rattus 
norvegicus 


phosphatidylinositol 5-phosphate 4- 
kinase gamma 


2087 


95 


242 


gi7295732 


Drosophila
melanogaster 


ft gene product 


2915 


39 


242 


gil57409 


Drosophila
melanogaster 


fat protein 


2901 


39 


242 


gil0727403 


Drosophila 
melanogaster 


ds gene product 


2236 


34 


243 


AAF90315 aa 
2 


Homo sapiens 


Winged helix/zinc finger transcription 
factor FOXP1 cDNA. 


819 


98 


243 


AAB82339 


Homo sapiens 


Winged helix/zinc finger transcription 
factor FOXP1. 


819 


98 


243 


gil2043714 


Homo sapiens 


clone pAB195 FOXP1 (FOXP1) 
mRNA, complete cds. 


819 


98 


244 


gil0440073 


Homo sapiens 


cDNA: FLJ23399 fis, clone HEP18254. 


2620 


100 


244 


gi7018524 


Homo sapiens 


mRNA; cDNA DKFZp762K137 (from 
clone DKFZp762K137); partial cds. 


2524 


100 


244 


gil4133227 


Homo sapiens 


mRNA for KIAA0970 protein, partial 
cds. 


1367 


51 


245 


AAB94855 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16042. 


1347 


100 


245 


gil0436290 


Homo sapiens 


cDNA FLJ13968 fis, clone
Y79AA1001493, weakly similar to 
UBIQUITIN-CONJUGATING 
ENZYME E2-17 KD 9 (EC 6.3.2.19). 


1347 


100 


245 


gil6198439 


Homo sapiens 


hypothetical protein FLJ13855, clone
MGC:16842 IMAGE:3915698, mRNA, 
complete cds. 


1347 


100 


246 


gi6330302 


Homo sapiens 


mRNA for KIAA1185 protein, partial
cds. 


2043 


100 


246 


AAG74603 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5367.


1530 


97 


246 


AAB53321 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:861. 


1530 


97 
247


gi535390 


Macronuclear 
Homo sapiens 


Human cellular retinol binding protein 
II (CRBPII) mRNA, complete cds. 


715 


99 


247 


gi397352 


Mus musculus 


mCRBPII


674 


91 


247 


gil2833902 


Mus musculus 


putative 


669 


90 


248 


AAG01285 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5366. 


209 


87 


248 


AAR05562 


Homo sapiens 


Laminin-binding protein encoded by
insert from J9 lambda gt10 phage.


209 


87 


248 


gil 149509 


Gallus gallus


37kD Laminin receptor precursor /p40
ribosomal associated protein 


209 


87 


249 


gil3162226 


Homo sapiens 


Human DNA sequence from clone 
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZ gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein 
HSPC130 (TH1 Drosophila homolog), 
the gene for tubulin beta 1 class VI 
(TUBB1), a gene encoding the CGI- 
107 protein (LOC51012), four CpG 
islands, ESTs, STSs and GSSs, 
complete sequence. 


1591 


100 


249 


gil 1230445 


Homo sapiens 


TUBB1 gene for human beta tubulin 1, 
class VI. 


1591 


100 


249 


gi212834 


Gallus gallus 


beta-tubulin 


1340 


85 


250 


gil3162226 


Homo sapiens 


Human DNA sequence from clone 
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZ gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein
HSPC130 (TH1 Drosophila homolog),
the gene for tubulin beta 1 class VI
(TUBB1), a gene encoding the CGI-
107 protein (LOC51012), four CpG 
islands, ESTs, STSs and GSSs, 
complete sequence. 


1986 


100 


250 


gil 1230445 


Homo sapiens 


TUBB1 gene for human beta tubulin 1,
class VI. 


1986 


100 


250 


gi212834 


Gallus gallus 


beta-tubulin 


1699 


85 


251 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha
subunit, complete cds.


1566


99






251 


gi559317 


Homo sapiens 


Human gene for ATP synthase alpha 
subunit, complete cds (exon 1 to 12). 


1566 


99 


251 


gi34468 


Homo sapiens 


H. sapiens mRNA for mitochondrial 
ATP synthase. 


1566 


99 


252 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha 
subunit, complete cds. 


2192 


84 


252 


gi559317 


Homo sapiens 


Human gene for ATP synthase alpha 
subunit, complete cds (exon 1 to 12). 


2192 


84 


252 


gi34468 


Homo sapiens 


H.sapiens mRNA for mitochondrial 
ATP synthase. 


2192 


84 


253 


gil4550508 


Homo sapiens 


Similar to CG8974 gene product, clone 
MGC:2460 IMAGE:2964524, mRNA, 
complete cds. 


1051 


100 


253 


gil5928691 


Mus musculus 


Unknown (protein for MGC:19394)


1036 


98 


253 


gi7293133


Drosophila 
melanogaster 


CG8974 gene product 


608 


66 


254 


AAE04880 


Homo sapiens 


Human protease protein-7 (PRTS-7). 


2795 


100 


254 


gil4043577 


Homo sapiens 


hypothetical protein FLJ12455, clone
MGC:13149 IMAGE:4298740, mRNA, 
complete cds. 


2795 


100 


254 


AAB94023 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14157. 


2781 


99 


255 


gi2501855 


Homo sapiens 


22 kDa actin-binding protein (SM22)
gene, complete cds. 


937 


95 


255 


gi2340833 


Homo sapiens 


DNA for SM22 alpha, complete cds. 


937 


95 


255 


gi2335047 


Homo sapiens 


mRNA for SM22 alpha, complete cds. 


937 


95 




256


gil5080204


Homo sapiens


similar to prokaryotic-type class I
peptide chain release factors, clone
MGC:20261 IMAGE:3029407, mRNA,
complete cds.




99


256 


gi6706658


Homo sapiens


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26.
Contains a novel gene, the gene for a
novel protein similar to Prokaryotic-
type class I peptide chain release
factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein
signaling 17, ESTs, STSs, GSSs and
two putative CpG islands, complete
sequence.




99


256 


gil5680165 


Homo sapiens 


similar to prokaryotic -type class I 
peptide chain release factors, clone 
MGC:20252 IMAGE:4646472, mRNA, 
complete cds. 


1375 


98 


257 


gil5080204 


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20261 IMAGE:3029407, mRNA,
complete cds. 


1706 


90 


257 


gi6706658 


Homo sapiens 


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26.
Contains a novel gene, the gene for a
novel protein similar to Prokaryotic-
type class I peptide chain release
factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein
signaling 17, ESTs, STSs, GSSs and
two putative CpG islands, complete
sequence.


1698


89






257 


gil5680165 


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20252 IMAGE:4646472, mRNA,
complete cds.


1133 


85 


258


gi7905482


Drosophila
melanogaster






41 


258 


gil2322327 


Arabidopsis 
thaliana


unknown protein 


451 


46 


258




Arabidopsis
thaliana


unknown protein




HO 


259




Homo sapiens 


Human protein sequence SEQ ID
NO:17548.


i 


100


259




Homo sapiens 


cDNA FLJ[illegible] fis, clone
[illegible], weakly similar to
PROBABLE PROTEIN DISULFIDE
ISOMERASE ER-60 PRECURSOR
(EC 5.3.4.1).


^01 1 


100


259 


ril5862252 


Homo sapiens


unnamed protein product


5008 




260 


gil5079416 


Homo sapiens 


secreted modular calcium-binding
protein 1, clone MGC:19895
IMAGE:4549051, mRNA, complete
cds.


2359 


100 


260 


AAB19394


Homo sapiens 


Amino acid sequence of a human 
secreted protein. 


2355 


99 


260 


gil0432431 


Homo sapiens 


mRNA for secreted modular calcium- 
binding protein (smoc1 gene).


2343 


99 


261 


gi7020475 


Homo sapiens 


cDNA FLJ20400 fis, clone KAT00587.


1687 


100 


261 


gill 18097 


Caenorhabditis 
elegans 


proline and glycine-rich 


268 


33 


261 


AAW49723 


Homo sapiens 


Protein polymer adhesive substrate 
PPAS1-F. 


261 


32


262 


gil6197949 


Drosophila 
melanogaster 


LD21896p 


325 


29 


262 


gi7293303 


Drosophila 
melanogaster 


CG9089 gene product 


325 


29 


262 


gi3170539 


Takifugu 
rubripes 


unknown 


291 


40 


263 


AAB42525 


Homo sapiens 


Human ORFX ORF2289 polypeptide 
sequence SEQ ID NO:4578. 


3570 


80 


263 


gi2887497 


Homo sapiens 


chromosome 19, overlapping cosmids 
R28707 and R34001, complete 
sequence. 


3570 


80 


263 


AAB42538 


Homo sapiens 


Human ORFX ORF2302 polypeptide 
sequence SEQ ID NO:4604.


2835 


99 


264 


gil4017849 


Homo sapiens 


mRNA for KIAA1816 protein, partial 
cds. 


1637 


99 


264 


gi8655687 


Homo sapiens 


mRNA; cDNA DKFZp762E1511
(from clone DKFZp762E1511).


892


100






264 


gi6979930 


Homo sapiens 


Maml mRNA, partial cds. 


315 


30 


265 


gil2836420 


Mus musculus 


putative 


2511 


93 


265 


gil0437002 


Homo sapiens 


cDNA: FLJ21013 fis, clone 
CAE05223. 


1859 


99 


265 


AAB58385 


Homo sapiens 


Lung cancer associated polypeptide 
sequence SEQ ID 723. 


1704 


99 


266 


gil4198321 


Mus musculus 


ribosomal protein L31


543 


92 


266 


&57US 


Rattus 
norvegicus


ribosomal protein L31 (AA 1-125) 


543 


92 


266 


gil4586963 


Mus musculus 


M75 


543 


92 


267 


gil78424 


Homo sapiens 


Human apolipoprotein A-II mRNA, 
complete cds. 


478 


96 


267 


gi296634 


Homo sapiens 


Human gene for apolipoprotein A-II.


478 


96 


267 


gi296633 


Homo sapiens 


Human DNA for apolipoprotein A-II. 


478 


96 


268 


AAB47184 


Homo sapiens 


ACPLX protein sequence. 


3571 


100 


268 


gi7321168 


Homo sapiens 


Human DNA sequence from clone 
RP5-860F19 on chromosome 20p12.3-
13 Contains the gene for KIAA1442
(similar to olfactory neuronal
transcription factors (COE1, COE2,
COE3, EBF3, OLF1)), RPL19 (60S
ribosomal protein L19) and HSPC080 
pseudogenes, the gene for 
metallocarboxypeptidase (CPX-1) and 
a novel gene. Contains ESTs, STSs, 
GSSs and four CpG islands, complete 
sequence. 


3571 


100 


268 


AAB36174 


Homo sapiens 


Human APG04 protein. 


3567 


99 


269 


gi2314829


Homo sapiens 


jerky gene product homolog mRNA, 
complete cds. 


1430 


59 


269 


gil0140857 


Mus musculus 


jerky 


752 


33 


269 


AAG62624 


Homo sapiens 


Human cell nucleus regulatory protein 
56. 


598 


34 


270 


gi7959227 


Homo sapiens 


mRNA for KIAA1483 protein, partial 
cds. 


2231 


99 


270 


gi34192 


Homo sapiens 


Human KUP mRNA for protein with 
two zinc fingers. 


627 


39 


270 


gil33 10782 


Mus musculus 


myoneurin 


315 


24 


271 


AAB93814 


Homo sapiens 


Human protein sequence SEQ ID 
NO:13604. 


1408 


97 


271 


gil0433080 


Homo sapiens 


cDNA FLJ11753 fis, clone
HEMBA1005583. 


1408 


97 


271 


AAB41771 


Homo sapiens 


Human ORFX ORF1535 polypeptide 
sequence SEQ ID NO:3070. 


821 


99 


272 


gi7959197 


Homo sapiens 


mRNA for KIAA1468 protein, partial 
cds. 


4603 


100 


272 


gil5080502 


Homo sapiens 


clone MGC:16944 IMAGE:4339646, 
mRNA, complete cds. 


4317 


94 


272 


gi9755831 


Arabidopsis 
thaliana 


putative protein 


675 


27 


273 


gil5080502 


Homo sapiens 


clone MGC:16944 IMAGE:4339646,
mRNA, complete cds. 


4362


98 
273


gi7959197 


Homo sapiens 


mRNA for KIAA1468 protein, partial 
cds. 


4360 


96 


273 


gi9755831 


Arabidopsis 
thaliana


putative protein 


704 


28 


274 


AAB92483 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10570. 


2626 


100 


274 


gi7021875 


Homo sapiens 


cDNA FLJ10051 fis, clone
HEMBA1001281. 


2626 


100 


274 


gil2837616 


Mus musculus


putative 


2065 


90 


275 


gil07 16076 


Homo sapiens 


mRNA for testis-abundant finger 
protein, complete cds. 


2739 


100 


275 


gil4043332 


Homo sapiens 


Similar to ring finger protein 23, clone 
MGC:2475 IMAGE:3051389, mRNA,
complete cds. 


2533 


94 


275 


gil07 16078 


Mus musculus


testis-abundant finger protein 


2497 


92 


276 


AAB44673 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 33 SEQ ID NO:138. 


1014 


96 


276 


gi!747 


Oryctolagus 
cuniculus 


trichohyalin 


213 


22 


276 


gil3936996 


Human 
herpesvirus 8 


ORF73 


203 


22 


277 


AAG74326 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5090.


1101 


100 


277 


AAB56461 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1039. 


778 


100 


277 


gil2842930 


Mus musculus 


putative 


688 


90 


278 


gi1020145


Homo sapiens


Human DNA binding protein [illegible]
mRNA, complete cds.




47 


278 


eil4456631 


Homo sapiens


Human DNA sequence from clone
RP1-54B20 on chromosome Xp11.1-
11.3. Contains the 5' end of a novel
SSX family protein gene, two novel
KRAB box containing C2H2 type zinc
finger protein genes, a KRAB box
protein pseudogene, the gene for a
novel protein similar to lysozyme C
(1,4-beta-N-acetylmuramidase), the
ZNF81 gene for zinc finger protein 81 
(HFZ20), ESTs, STSs, GSSs and three 
CpG islands, complete sequence. 


1497 


55 


278 


gi498152 


Homo sapiens 


Human mRNA for KIAA0065 gene, 
partial cds. 


1495 


46 


279 


gi2914676 


Homo sapiens 


chromosome 16, cosmid clone 360H6 
(LANL), complete sequence. 


882 


35 


279 


gil4250678 


Homo sapiens 


clone MGC:10489 IMAGE:3945548,
mRNA, complete cds. 


882 


35 


279 


gi2342506 


Homo sapiens 


mRNA for zinc finger protein FPM315,
complete cds. 


875 


35 


280 


gi434779 


Homo sapiens 


Human mRNA for KIAA0112 gene,
partial cds. 


2072 


100 


280 


gil5278392 


Homo sapiens 


homolog of yeast ribosome biogenesis
regulatory protein RRS1, clone
MGC:4831 IMAGE:3603972, mRNA,
complete cds.


1905


100






280 


gil2804751 


Homo sapiens 


Similar to regulator for ribosome
resistance homolog (S. cerevisiae),
clone MGC:2755 IMAGE:2824034,
mRNA, complete cds.


1905 


100 


281 


AAB95761 


Homo sapiens


Human protein sequence SEQ ID
NO:18686.


789 


100 


281 


AAG81272 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:62. 


789 


100 


281 


gil4035852 


Homo sapiens 


unnamed protein product 


789 


100 


282


tnlSfiROQI 1 


Homo sapiens


neo-poly(A) polymerase mRNA,
complete cds. 


^709 


00 


282 


gil5384858 


Homo sapiens 


mRNA for poly(A) polymerase gamma 
(PAPOLG gene).


3191 


99 


282 


gil3641252 


Homo sapiens 


SRP RNA 3' adenylating enzyme/pap2
mRNA, complete cds. 


3779 


99 




gioou / oyo 


Homo sapiens 


mRNA; cDNA DKFZp434A1014

(from clone DKFZp434A1014); partial 
cds. 


1 All 




283 


gil2853788 


Mus musculus 


putative 


408 


38


283




Xenopus
laevis


speedy protein 


1 ZA 


zo 


284 






mRNA for KIAA[illegible] protein, partial
cds.


1 A170 


99


284 


gi!3702612 


Staphylococcus
aureus subsp.
aureus N315


ORFID:SA2447~hypothetical protein,
similar to streptococcal hemagglutinin 
protein 


223 


19 


284 


gil4248429 


Staphylococcus
aureus subsp.
aureus Mu50 


hypothetical protein 


223 


19 


285 


gil2697941 


Homo sapiens 


mRNA for KIAA1698 protein, partial 
cds. 


4716 


100 


285 


gi7299794 


Drosophila 
melanogaster 


CG9591 gene product 


290 


31 


285


A/xK^S/ZjO 


Homo sapiens 


Natural killer lytic associated protein. 


92 


40 


286 


AAG62395 


Homo sapiens 


Human zinc finger protein 46.


2375 


100 


286


gl/D /OZ/H 


Homo sapiens 


Human DNA sequence from clone 
RP11-393J16 on chromosome 10.
Contains part of the ZNF33A gene for
zinc finger protein 33a (KOX 31), a
novel gene for a novel KRAB box 
containing zinc finger gene, a zinc 
finger pseudogene, ESTs, STSs, GSSs 
and two putative CpG islands, complete 
sequence. 


2015 


100 


286 


gi881564 


Homo sapiens 


Human zinc finger containing protein 
ZNF157 (ZNF157) mRNA, complete 
cds. 


1339 


51 


287 


gi2822143 


Homo sapiens 


chromosome 19, cosmid R30217,
complete sequence. 


1838 


53 


287 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304 
gene). 


1735 


50 


287 


gil3543419


Homo sapiens


Similar to zinc finger protein 304,
clone MGC:4079 IMAGE:3530863,
mRNA, complete cds.


1735


51






288 


gi540469


Homo sapiens


(clone HGT26) T cell receptor gamma- 
chain mRNA, V region. 


399 


91 


288 


gi3047024


Homo sapiens


T-cell receptor gamma V1 gene region.


384 


100 


288 


gi339167 


Homo sapiens


Human T-cell receptor rearranged 
gamma-chain gene V-region (V4) 
(subgroup I). 


384 


100 


289 


AAY69976 


Homo sapiens


DHFR-HM protein.


886 


93 


289 


gi182724


Homo sapiens


Human dihydrofolate reductase gene.


886 


93 


289 


gil82717 


Homo sapiens


Human dihydrofolate reductase gene,
exon 6 and 3' flank.


886 


93 


290 


AAE01782 


Homo sapiens


Human gene 1[illegible] encoded secreted
protein HDPNW93, SEQ ID NO:103.


4269 


99 


290 


gi10437433


Homo sapiens


cDNA: FLJ91347 fis, clone
COL02724. 


4127 


97 


290 


AAB74693 


Homo sapiens 


Human protease and protease inhibitor 
PPIM-26. 


3948 


99 


291




Mus musculus


XLl^tXX-' 


955 


90 


291 


gil2844277 


Mus musculus


putative 


800 


79 


291


A A Yl 9^1 fl 
An I Jx. J 1U 


Homo sapiens


Human [illegible] secreted protein SEQ ID
NO:541.


645? 

\J*tO 




292




Homo sapiens


X W 1 xv*t. 


2798 




292 


gil5141735 


Homo sapiens 


unnamed protein product 


2798 


98


292




Homo sapiens


mRNA for chromosome 12 open
reading frame [illegible].


914 


94 


293 


gil0440367 


Homo sapiens 


mRNA for FLJ00018 protein, partial
cds. 


5938 


100 


293 


gil5488570 


Homo sapiens 


Similar to hypothetical protein 
FLJ00018, clone MGC:10073
IMAGE:3896004, mRNA, complete 
cds. 


4736 


99 


293 


gil0438857 


Homo sapiens 


cDNA: FLJ22458 fis, clone
HRC10001. 


1570 


99 


294 


AAB08948 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO:105.


1601 


99 


294 


AAB08911 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO:68. 


1601 


99 


294 


AAB80238 


Homo sapiens 


Human PRO238 protein.


641 


44 


295 


AAB18457 


Homo sapiens 


A human TANGO 216 polypeptide 
clone. 


2106 


98 


295 


AAB18447 


Homo sapiens 


Amino acid sequence of human 
TANGO 216 polypeptide. 


2106 


98 


295 


gil4017381 


Homo sapiens 


tumor endothelial marker 8 precursor 
(TEM8) mRNA, complete cds. 


1231 


57 


296 


gil4388342 


Macaca 
fascicularis 


hypothetical protein 


3833 


92 


296 


gi7243195 


Homo sapiens 


mRNA for KIAA1407 protein, partial 
cds. 


3817 


100 


296 


gil5451319 


Macaca 
fascicularis 


hypothetical protein 


2408 


91 


297 


gi7243039 


Homo sapiens 


mRNA for KIAA1329 protein, partial 
cds. 


4761 


100 





297 


gil2007720 


Mus musculus 


VPS10 domain receptor protein
SORCS2


4466 


88 


297 


gi7715916


Mus musculus 


SorCSb splice variant of the VPS10 
domain receptor SorCS 


2177 


47 


298 


AAM00812 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 175. 


1488 


99 


298 


gil2846045 


Mus musculus 


putative 


1387 


65 


298 


AAM00925 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 401. 


996 


100 


299 


gi7298852 


Drosophila 
melanogaster 


CG10068 gene product 


609 


43 


299 


gi8655669 


Homo sapiens 


mRNA; cDNA DKFZp547C176 (from 
clone DKFZp547C176). 


482 


52 


299 


AAB42048 


Homo sapiens 


Human ORFX ORF1812 polypeptide 
sequence SEQ ID NO:3624. 


325 


46 


300 


gil4043285 


Homo sapiens 


Similar to KIAA0808 gene product, 
clone MGC:15880 IMAGE:3529159,
mRNA, complete cds. 


1306 


97 


300 


gi7263912 


Homo sapiens 


Human DNA sequence from clone 
RP5-1108D11 on chromosome 20q12-
13.11 Contains part of the gene for a
novel protein similar to C. elegans 
T22C1.7, part of the gene for a novel 
HMG (high mobility group) box 
protein similar to KIAA0737, 
KIAA0808 and TNRC9 (CAGF9), 
ESTs, STSs, GSSs and two putative 
CpG islands, complete sequence. 


797 


96 


300 


gi3882337 


Homo sapiens 


mRNA for KIAA0808 protein, 
complete cds. 


767 


55 


301 


gil5430292 


Homo sapiens 


muscle alpha-kinase (MAK) mRNA, 
complete cds. 


5445 


99 


301 


gi7243041


Homo sapiens 


mRNA for KIAA1330 protein, partial
cds. 


4933 


100 


301 


gil4331137 


Mus musculus 


myocytic induction/differentiation
originator 


3684 


72 


302 


gil4550508 


Homo sapiens 


Similar to CG8974 gene product, clone 
MGC:2460 IMAGE:2964524, mRNA, 
complete cds. 


589 


100 






Mus musculus 


Unknown (protein for MGC:19394)


D /H 


0*7 


302 


gi2564951 


Mus musculus 


unknown 


378 


72 


303 


gi7242955 


Homo sapiens 


mRNA for KIAA1300 protein, partial 
cds. 


9573 


99 


303 


gi6599162 


Homo sapiens 


mRNA; cDNA DKFZp434N1272
(from clone DKFZp434N1272); partial 
cds. 


1392 


98 


303 


AAG75083 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5847.


628 


92 


304 


gil408209 


Homo sapiens 


Human endogenous retrovirus HERV- 
K(HML6) proviral clone HML6.17 
putative polymerase and envelope 
genes, partial cds, and 3'LTR. 


398 


86 


304 


gi2801455


Mouse
mammary
tumor virus


Pr160


176


48








304 


'Ct\ 1 1 ooo 

gi691 lzaa 


Exogenous 
mouse 
mammary 
tumor virus 


Gag-Pro-Pol


1 /o 




305 


gi14269502


Homo sapiens 


unconventional myosin 1G valine form 
(MYO1G) mRNA, MYO1G-V allele,
partial cds. 


5Zoy 


ys 


305 


gil4269504 


Homo sapiens 


unconventional myosin 1G methionine
form (MYO1G) mRNA, MYO1G-M
allele, partial cds.


3266 


97 


305 


gi3724141 


Rattus 
norvegicus 


myosin I 


1 1 if\ 


3/ 


306 


gi2145060


Homo sapiens 


TTF-I interacting peptide 20 mRNA,
partial cds. 


oa.qi 
AVai 


99


306 


gi2224593


Homo sapiens 


Human mRNA for KIAA[illegible] gene,
partial cds. 


045 


•JO 

jy 


306


gi455555 


Homo sapiens 


Human zinc finger protein ZNF135
mRNA, complete cds. 


5VU 




307 


gil31 83883 


Homo sapiens 


PD-1-ligand 2 protein (PDL2) mRNA,
complete cds. 


1417 


99 




•■1 OCiCQ/1 1 A 


Homo sapiens 


butyrophilin precursor B7-DC mRNA,
complete cds.


1417


99


307


A AT7A1 K*) 


Homo sapiens 


Human gene 1 encoded secreted 
protein HDPPA04, SEQ ID NO:74. 


1*110 




308


A/uJo /*O0 


Homo sapiens 


Human gene 22 encoded secreted 
protein fragment, SEQ ID NO:177. 




100




A A DQ/IRAR 
AAoyHOOo 


Homo sapiens


Human protein sequence SEQ ID
NO:16072.




100


308 


gil0436314 


Homo sapiens 


cDNA FLJ13984 fis, clone
Y79AA1001846. 


383 


100 


309


A AVQ^ftO^ 

AAiojvzd 


Homo sapiens 


Human Rap2 amino acid sequence. 




55 


309 


gi4678734 


Homo sapiens 


Human gene from PACs 37M17 and
305B16, chromosome X, similar to 
small G proteins, especially RAP-2A. 


206 


33


309


A AX/TAPiO^A 
AAJVlUUyOO 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 432. 








guoyio 


Homo sapiens


Human mRNA for T-cell receptor
alpha-chain HAP50 V(a)8.2-J(a)M.


586


100


310 


gil223888 


synthetic 
construct 


T cell receptor alpha chain 


586 


100 


310 


gi2358036 


Homo sapiens 


T-cell receptor alpha delta locus from 
bases 250472 to 501670 (section 2 of 
5) of the Complete Nucleotide 
Sequence. 


586 


100 


311 


AAE01596 


Homo sapiens 


Human gene 13 encoded secreted 
protein HCLCJ15, SEQ ID NO:146.


1066 


92 


311 


AAE04136 


Homo sapiens 


Human gene 6 encoded secreted 
protein HCLBW50, SEQ ID NO: 123. 


1066 


92 


311 


gi31135 


Homo sapiens 


H.sapiens mRNA for elongation factor
1-beta. 


1066 


92 


312 


gi7243137 


Homo sapiens 


mRNA for KIAA1378 protein, partial
cds.


2400


99






312 


gil2314036 


Homo sapiens 


Human DNA sequence from clone 
RP3-383J4 on chromosome 1q24.1-
24.3 Contains part of a gene encoding a
kelch motif containing protein, part of a
novel gene encoding a protein similar
to Aspartyl-tRNA synthetase, a
putative novel gene, a 40S ribosomal 
protein S27 (RPS27) pseudogene, 2 
CpG islands, ESTs, STSs and GSSs, 
complete sequence. 


1184 


44 


312 


gi4650844 


Homo sapiens 


mRNA for Kelch motif containing 
protein, complete cds. 


1176 


44 


313 


gi7019945 


Homo sapiens 


cDNA FLJ20079 fis, clone COL03057.


1610 


83 


313 


<H1 9R04791 


Homo sapiens


clone MGC:2663 IMAGE:3543910,
mRNA, complete cds. 


1271 


48 


313


AAB43912


Homo sapiens


Human cancer associated protein
sequence SEQ ID NO:1357.


1255 


45 


314 


AAB41414 


Homo sapiens


Human ORFX ORF1178 polypeptide
sequence SEQ ID NO:2356. 


5094 


97 


314 


gi6329897 


Homo sapiens


mRNA for KIAA1137 protein, partial
cds. 


4798 


98 


314 


gi14043759


Homo sapiens 


clone IMAGE:4111596, mRNA,
partial cds. 


3906 


98 


315 


AAB28375 


Homo sapiens 


Human hyperpolarisation-activated 
channel HAC3.


3686 


99 


315 


gi7959337 


Homo sapiens 


mRNA for KIAA1535 protein, partial 
cds. 


3665 


99 


315 


gi3242244 


Mus musculus 


hyperpolarization-activated cation 
channel, HAC3 


3556 


96 


316 


gil4198399 


Mus musculus 


RIKEN cDNA 1500034J20 gene


837 


93 


316 


gil2854536 


Mus musculus 


putative 


837 


93 


316 


gil4250857 


Homo sapiens 


Human DNA sequence from clone 
RP5-1137017 on chromosome 11p12-
14.2 Contains part of a gene similar to
putative mitochondrial inner
membrane protease subunit 2, a novel
mRNA, ESTs and GSSs, complete 
sequence. 


775 


100 


317 


gil0439850 


Homo sapiens 


cDNA: FLJ23233 fis, clone
CAS00458.


1081 


50 


317 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304 
gene). 


1039 


48 


317 


gil4249844 


Homo sapiens 


Similar to hypothetical protein 
FLJ23233, clone MGC:14876
IMAGE:3544044, mRNA, complete
cds. 


1037 


47 


318 


gil 1863686 


Mus musculus 


neurobeachin 


3371 


96 


318 


gil 1863539 


Gallus gallus 


neurobeachin 


2100 


89 


318 


AAB92596 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10843.


1721 


100 


319 


gil2698174 


Macaca 
fascicularis


hypothetical protein 


1221 


95 
319


gil0439153 


Homo sapiens 


cDNA: FLJ22672 fis, clone HSI09265.


1085 


99 


319 


gi7020125 


Homo sapiens 


cDNA FLJ20190 fis, clone COLF0714.


893 


50 


320 


gi2865219 


Homo sapiens 


integrin binding protein Del-1 (Del1)
mRNA, complete cds. 


447 


100 


320 


AAW94685 


Homo sapiens 


Human Del-1 protein.


438 


98 


320 


AAW10365 


Homo sapiens 


Human developmentally-regulated 
endothelial cell locus-1 protein.


438 


98 


321 


AAB27246 


Homo sapiens 


Human EXMAD-24 SEQ ID NO: 24. 


2047 


100 


321 


AAB42385 


Homo sapiens 


Human ORFX ORF2149 polypeptide 
sequence SEQ ID NO:4298. 


2047 


100 


321 


gi52998 


Mus musculus


macrophage mannose receptor 
precursor 


164 


31 


322 


gi12834087


Mus musculus 


putative 


1456 


82 


322 


gi2463628


Homo sapiens 


Human putative monocarboxylate 
transporter (MCT) mRNA, complete 
cds. 


506 


29


322 


gi2198807 


Gallus gallus 


monocarboxylate transporter 3 


473 


27 


323 


gi15620909


Homo sapiens 


mRNA for KIAA1925 protein, partial 
cds. 


1059 


38 


323 


AAB92496 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10598. 


1050 


36 


323 


gi7021900


Homo sapiens 


cDNA FLJ10065 fis, clone
HEMBA1001455. 


1050 


36 


324 


gi9651075


Macaca 
fascicularis 


unnamed protein product 


3716 


95 


324 


gi15145795


Sus scrofa 


basic proline-rich protein 


222 


26 


324 


gi5917666 


Zea mays 


extensin-like protein 


195 


25 


325 


gi7529597


Homo sapiens


Human DNA sequence from clone
RP3-402N21 on chromosome 6p21.1- 
21.31. Contains up to three novel genes 
with MAM and immunoglobulin 
domains. Contains ESTs, STSs, GSSs 
and four putative CpG islands, 
complete sequence. 


1474 


100 


325 


gi12836077


Mus musculus 


putative 


1365 


95 


325 


AAE00586 


Homo sapiens 


Human nuclear cell adhesion molecule 
homologue, NCAM d 2 protein. 


1303 


49 


326 


gi15278193


Homo sapiens 


MAGI-1C beta mRNA, complete cds, 
alternatively spliced 


1492 


100 


326 


gi2702351 


Mus musculus 


putative membrane-associated 
guanylate kinase 1 


1112 


83 


326 


gi5817255 


Homo sapiens 


mRNA; cDNA DKFZp434B203 (from 
clone DKFZp434B203); partial cds. 


739 


100 


327 


AAB01432 


Homo sapiens 


Human TANGO 239 (form 2). 


3675 


99 


327 


AAB01426 


Homo sapiens 


Human TANGO 239. 


2700 


100 


327 


AAB00036 


Homo sapiens 


Human TANGO 239 partial sequence. 


2483 


97 


328 


gi7243117 


Homo sapiens 


mRNA for KIAA1368 protein, partial 
cds. 


5542 


100 


328 


AAY71460 


Homo sapiens 


Human semaphorin 6A-1 . 


5422 


98 


328 


gi10187891


Homo sapiens 


unnamed protein product 


5422 


98 


329 


gi13676461


Macaca 
fascicularis 


hypothetical protein 


2193 


75 


329 


gi4589566 


Homo sapiens 


mRNA for KIAA0961 protein, 
complete cds. 


2190 


75 


329 


gi456269 


Mus musculus 
domesticus 


zinc finger protein 30 


2073 


71 


330 


AAB94295 


Homo sapiens 


Human protein sequence SEQ ID 

INvJ. i 4 * /*f /. 


3062 


99 






— : 

Homo sapiens 


cms a ruiz/oo ns, clone 
NT2RP2001576, weakly similar to 
HYPOTHETICAL 62.2 KD PROTEIN 
C4G8.12C IN CHROMOSOME I. 


QO£9 




330 


gi7291781 


Drosophila 
melanogaster 


CG3419 gene product 


471 


32 


331 


gi12852801


Mus musculus 


putative 


1185 


95 


331 


gi12314230


Homo sapiens 


Human DNA sequence from clone 
RF5-o4orl3 on chromosome lpzl.l- 
22. 1 Contains part of the PPAP2C 
(phosphatidic acid phosphatase type 2c) 
gene, ESTs, STSs and GSSs, complete 
sequence. 


975 


100 






Homo sapiens 


CUNA rLJ/U3UU ELS, Clone rtbrOMOj. 


748 


56 


332 


gi12309630


Homo sapiens 


Human DNA sequence from clone 

DDI 1 _A 1 ©DTI a« olirnm/uiAmA ft 

Kri on cnromosome y 
isuuiauia a novei gene ior a neuronal 
leucine-rich repeat protein, ESTs, STSs 


3138 


100 


332 


AAB31161 


Homo sapiens 


Amino acid sequence of a human 
TOLL protein. 


2600 


86 


332 


gi13444976


Homo sapiens 


unnamed protein product 


2600 


86 


333 




coniAnc 

lLuiiiu jmyiciUi 


mrsjxri i or Ain/iuozo pro rein, paruai 
cds. 




00 


333 



gi14249936


Homo sapiens


Similar to S-adenosylhomocysteine
hydrolase-like 1, clone
IMAGE:3536052, mRNA, partial cds.




inn 


333 


AAW56097 


Homo sapiens


Amino acid sequence of the 0T}D4hS 3
enzyme. 


2466 


84 


334 


gi13625385


Homo sapiens 


EPI64 (EPI64) mRNA, complete cds. 


1026 


46 


334 


AAB95321 


Homo sapiens 


Human protein sequence SEQ ID
NO: 17577. 


1023 


50 


334 


gi10435007


Homo sapiens 


cDNA FLJ13130 fis, clone
NT2RP3002972, weakly similar to 
Halocynthia roretzi mRNA for HrPET- 
1. 


1023 


50 


335 


gi15862408


Homo sapiens 


unnamed protein product 


2255 


95 


335 


gi13272520


Mus musculus 


pancreatitis-induced protein 49 


2021 


85 


335 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64-1776 cDNA clone. 


1784 


95 


336 


gi15862408


Homo sapiens 


unnamed protein product 


2281 


99 


336 


gi13272520


Mus musculus 


pancreatitis-induced protein 49 


2047 


88 


336 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64-1776 cDNA clone. 


1810 


99 


337 


gi4545313 


Mus musculus 


prominin-like protein


1021 


77 


337


gi15042603


Rattus 
norvegicus


prominin 


647 


30 
337


AAB94028 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14170. 


642 


29 


338 


gi2978255 


Mus musculus 


myeloid zinc finger protein-2 


212 


42 


338 


AAB54292 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:744. 


208 


30 


338 


gi8886436 


Homo sapiens 


myeloid zinc finger protein 1 splice 
variants (ZNF42) gene, complete cds, 
alternatively spliced. 


207 


42 


339 


gi3882269 


Homo sapiens 


mRNA for KIAA0774 protein, partial 
cds. 


5974 


99 


339 


gi12860422


Mus musculus 


putative 


692 


96 


339 


gi15424451


Homo sapiens 


hATTP3 


606 


36 


340 


AAB36617 


Homo sapiens 


Human FLEXHT-39 protein sequence
SEQ ID NO:39. 


584 


100 


340 


gi8218050 


Homo sapiens 


Human DNA sequence from clone 
RP1-187J1 1 on chromosome oql 1.1- 
22.33. Contains the gene for a novel 
protein similar to S. pombe and S. 
cerevisiae predicted proteins, the gene 
for a novel protein similar to protein 
kinase C inhibitors, the 3' end of the 
gene for a novel protein similar to 
ui ubupxuid i^oz duQ pxcQicicci worm 
proteins, ESTs, STSs, GSSs and two 

sequence 


562 


100 


340 


gi13540300


Mus musculus 


nucleolar protein C7B 


415 


66 


341 


gi14583268


X±\JU1XJ Ou^/ivlU) 


pvtfYnlflCmip •nrntfrn mPM A rnrrmlptr 

cds. 






341 


gi2104769


Homo sapiens


echinoderm microtubule-associated
protein homolog HuEMAP mRNA,
complete cds. 


560 


65 


341 


gi4406218 


Homo sapiens 


echinoderm microtubule-associated 
protein-like EMAP2 mRNA, complete 
cds. 


495 


59 


342 


AAB60099 


Homo sapiens 


Human transport protein TPPT-19. 


1616 


93 


342 


gi7294748 


Drosophila 
melanogaster 


CG7616 gene product


580 


43 


342 


gi14714781


Mus musculus 


RIKEN cDNA 2610005A10 gene


441 


35 


343 


AAB94374 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14915. 


3938 


99 


343 


gi10434690


Homo sapiens 


cDNA FLJ12921 fis, clone
NT2RP2004600. 


3938 


99 


343 


gi5689736 


Homo sapiens 


mRNA for myopodin. 


883 


34 


344 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


111 


100 


344 


gi10953950


Geochelone 
carbonaria 


alpha-D chain hemoglobin 


407 


54 


344 


gi4455876 


Cairina
moschata 


alpha D-globin 


398 


53 


345 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


668 


78 


345 


gi10953950


Geochelone
carbonaria


alpha-D chain hemoglobin


359


43


345 


gi4455876 


Cairina 
moschata 


alpha D-globin 


349 


41 


346 


gio6jjo6y 


Homo sapiens 


mRNA; cDNA DKFZp547C176 (from
clone DKFZp547C176).


11/ J J 


inn 

1UU 


346 


AAB42048


Homo sapiens 


Human ORFX ORF1812 polypeptide
sequence SEQ ID NO:3624.




inn 

1UU 


346 


gl/29oo32 


Drosophila
melanogaster 


^"V"21AA/CQ nana 

c vi i uuoo gene proauct 


OU1 


Art 


347 


gi15778899


Homo sapiens 


Similar to f-box only protein 17, clone 
MuClll I6z lMAOb:3o419Ul, mRNA, 
complete cds. 


1537 


99 


347 


gi9280060


Macaca 
fascicularis 


unnamed protein product 


1435 




347 


gi15214527


Homo sapiens 


Similar to f-box only protein 17, clone 
complete cds.


857 


06 


348 


AAG64860 


Homo sapiens 


Heart muscle cell differentiation related 
procem id 01. 


1079 


90 


348 


AAB99931 


Homo sapiens 


Human MesPl protein sequence SEQ 

XL/ IN w.Ol . 


1079 


90 


348 


gi13623241


Homo sapiens 


Similar to mesoderm posterior 1, clone 
MGC:10676 IMAGE:3944350, mRNA,
complete cds. 


1079 


90 


349




Homo sapiens 


cnromosome iy, dal ^7470 (U1-d- 
26X23), complete sequence. 


JO/ 


1UU 


349 


gi8163824 


Homo sapiens 


krueppel-like zinc finger protein HZF2
mRNA, complete cds.


290 


74 


349 


AAY39779 


Homo sapiens 


CBMACD04 protein sequence. 


286 


71 


350 


gi7673618 


Mus musculus 


ubiquitin specific protease 


2016 


73 




gi5689463


Homo sapiens 


mRNA for KIAA1063 protein, partial
cds. 


ZUUU 


fLA 


350 


gi16198231


Drosophila 
melanogaster 


LD43147p 


1188 


46 


351 


gi13540193


Homo sapiens 


isopentenyl pyrophosphate isomerase 1 
(IDI1), HT009-like protein, and
isopentenyl pyrophosphate isomerase 
type 2 (IDI2) genes, complete cds. 


1202 


100 


351 


gil392576o 


Homo sapiens 


isopentenyl diphosphate dimethylallyl 
diphosphate isomerase 2 (IDI2) gene,
exon 4 and complete cds. 


1202 


100 


351 


gi13925769


Homo sapiens 


isopentenyl diphosphate dimethylallyl 
diphosphate isomerase 2 (IDI2) 
mRNA, complete cds. 


1202 


100 


352 


gi13561001


Homo sapiens 


Human DNA sequence from clone 
RP11-528A10 on chromosome 6
Contains an IMPDH1 (IMP (inosine 
monophosphate) dehydrogenase 1) 
pseudogene, an RNA helicase 
pseudogene, a novel gene similar to 
KIAA0161, ESTs, STSs and GSSs, 
complete sequence. 


950 


100 


352 


gi13991706


Mus musculus 


UbcM4-interacting protein 4


655 


53 
352


gi1136384


Homo sapiens 


Human mRNA for KIAA0161 gene,
complete cds. 


651 


53 


353 


gi13561001


Homo sapiens 


Human DNA sequence from clone 
RP11-528A10 on chromosome 6
Contains an IMPDH1 (IMP (inosine 
monophosphate) dehydrogenase 1) 
pseudogene, an RNA helicase 
pseudogene, a novel gene similar to 
KIAA0161, ESTs, STSs and GSSs, 
complete sequence. 


709 


79 


353 


gi13991706


Mus musculus 


UbcM4-interacting protein 4 


506 


45 


353 


gi1136384


Homo sapiens 


Human mRNA for KIAA0161 gene,
complete cds. 


502 


44 


354 


AAB74446 


Homo sapiens 


Human protease-inhibitor like protein.


2759 


100 


354 


gi12053227


Homo sapiens 


mRNA; cDNA DKFZp434B044 (from
clone DKFZp434B044); complete cds. 


2756 


99 


354 


gi15593902


Homo sapiens 


unnamed protein product 


2743 


99 


355 


AAB94358 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14883. 


1788 


98 


355 


gi10434632


Homo sapiens 


cDNA FLJ12886 fis, clone 
NT2RP2004041, weakly similar to 
SYNAPSINS IA AND IB. 


1788 


98 


355 


gi12052738


Homo sapiens 



mRNA; cDNA DKFZp564H1322 
(from clone DKFZp564H1322); 
complete cds. 


1788 


98 


356 


gi13436437


Homo sapiens 


Similar to RIKEN cDNA 5730438N18
gene, clone MGC:4399 
IMAGE:2905957, mRNA, complete 
cds. 


1634 


99 


356 


gi15030091


Mus musculus 


Similar to RIKEN cDNA 5730438N18
gene 


1508 


91 


356 


AAB43372 


Homo sapiens 


Human ORFX ORF3136 polypeptide 
sequence SEQ ID NO:6272.


1464 


91 


357 


AAB73511 


Homo sapiens 


Human transferase HITS-18, SEQ ID
NO:18.


1880 


99 


357 


AAG74560 


Homo sapiens 


Human colon cancer antigen protein
SEQ ID NO:5324.


450 


98 


357 


AAG02792 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
6873. 


324 


96 


358


gi7673618


Mus musculus


ubiquitin specific protease


971 1 
Z/ 11 




358 


gi5689463 


Homo sapiens 


mRNA for KIAA1063 protein, partial 
cds. 


2382 


78 


358 


gi5823525 


Drosophila 
melanogaster 


ubiquitin-specific protease nonstop


1305 


49 


359 


AAB94775 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 15864. 


1022 


100 


359 


gi10435984


Homo sapiens 


cDNA FLJ13842 fis, clone 
THYRO1000793. 


1022 


100 


359 


gi2340162 


Xenopus 
laevis 


dsRBP-ZFa 


380 


44 


360 


gi3676086 


bacteriophage 
PS119 


gpl9 


291 


59 


360 


gi1778468


Escherichia
coli


hypothetical protein


287


59








360 


gi1786768


Escherichia
coli K12


bacteriophage lambda lysozyme
homolog 


287 


59 


361 


gi13544003


Homo sapiens


clone IMAGE:3677165, mRNA,
partial cds. 


2172 


88 


361 


gi3169073


Schizosaccharomyces
pombe


phenylalanyl-tRNA synthetase,
mitochondrial precursor 


233 


33


361 


gi13877969


Arabidopsis
thaliana


putative phenylalanine-tRNA
synthetase


99R 




362 


gi293694 


Mus musculus 


laminin receptor 


370 


49 


362


oii1277921 


Mus musculus


laminin receptor 1 (67kD, ribosomal
protein SA)


167 

JO/ 


40 


362


cri4611R1Q 


Mus musculus


J /XLUa IJUIAJ1CUJJ oUUgvll 


167 
JO/ 


40 


363




Homo sapiens


ICOlCo UCVClUpiIlCUl~I.wljll.CU AN ILrOrxl 

mRNA, complete cds. 


1R76 


inn 


363




Homo sapiens


clone DKFZp434H092); partial cds.


1690 


mo 


363 


gi7294427 


Drosophila
melanogaster


CG8797 gene product 


118 


21 


364 


AAE01355 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:77. 


2724 


97 


364


cri 1?fl16042 


Mus musculus


putative




01 


364 


AAE01380 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:102. 


2500 


97 


365 


gi10439688


Homo sapiens 


cDNA: FLJ23109 fis, clone 
LNG07754. 


2809 


99 


365


oi0699001 


Mus musculus


£r-caanenn uinaing proiein n i 


2. /DO 


Q7 


365


rV/WJU 1 / O J 


Homo sapiens


Human secreted protein, SEQ ID NO:
5846.


717 




366


01198*5400^ 
gi i £,ojHyyj 


Mus musculus


putative


OA A 


71 


366 


gi10241691


Homo sapiens 


Novel human gene mapping to 

rhrfctrmcrvmp 99 


791 


99 


366 


gi14602790


Homo sapiens 


DKFZP566F0546 protein, clone 

A/ffir , '94it4 TMAfrP*9«99^7ft mRNA 

complete cds. 


791 


99 


367 


nil 50899 R1 


Homo sapiens


Q-J mil cj t tf\ email rTlntaTYiinA—fi - V» 

tetratricopeptide repeat (TPR)- 
containing, clone MGC: 10496 

TMAGi c; *l69S0Q1 mPMA mmnlptp 

cds. 


79f» 


inn 
IUU 


367 


gi3377591 


Homo sapiens 


full length insert cDNA YN88E09.


592 


100 


367 


gi15488015


Homo sapiens 


TPR-containing co-chaperone mRNA,
complete cds. 


450 


64 


368 


gi9104819 


Xylella 
fastidiosa 9a5c 


hypothetical protein 


151 


43 


368 


AAY59981 


Homo sapiens 


Human endometrium tumour EST 
encoded protein 41. 


128 


46 


368 


AAE03351 


Homo sapiens 


Human gene 4 encoded secreted 
protein fragment, SEQ ID NO: 126. 


121 


58 


369 


gi5817053 


Homo sapiens 


mRNA; cDNA DKFZp586D0824 
(from clone DKFZp586D0824); partial 
cds. 


571 


43 


369 


gi15530285


Homo sapiens 


clone MGC:24275 IMAGE:3950542,
mRNA, complete cds.


571


43






369 


gi13569476


Mus musculus


immunity-associated nucleotide 4 


340 


A1 


370 


gi8453103 


Homo sapiens 


zinc finger protein mRNA, complete 
cds.


1296 


58 


370 


gi15012179


Homo sapiens 


zinc finger protein 16 (KOX 9), clone
MGC:15145 IMAGE:3949487, mRNA,
complete cds. 


1296 


58


370 


gi498721 


Homo sapiens 


H.sapiens HZF10 mRNA for zinc
finger protein. 


1279 


55 


371 


gi15929964


Homo sapiens 


Similar to hypothetical protein 
rUlU/UZ, clone MuC:xl954 
iMAOii.43yio/i, mKCNA, complete 
cds. 


973 


100


371 


AAB42330 


Homo sapiens 


Human ORFX ORF2100 polypeptide
sequence SEQ ID NO:4200. 


93/ 


1/3 


371


A A "D QIAO A 


Homo sapiens 


Human protein sequence SEQ ID
NO:11912. 






372


gl/3zo451 


Mus musculus 


sialidase






372 


AAB93971 


Homo sapiens 


Human protein sequence SEQ ID

INU.14U.3o. 


866 


42 


372 


AAW73964 


Homo sapiens 


Human sialidase protein sequence. 


866 


42 




gll4oUUU5 


Mus musculus 


Zic4 protein 


1 AQf\ 


oO 


373 


AAB14349


Homo sapiens 


Human Zic1 protein.


1102 


67 


373


m 'i OAOvllO. 

guzuo4zy 


Homo sapiens 


mRNA for Zic protein, complete cds.


1 1AO 


fin 
of 


374 


gi12860114


Mus musculus 


putative 


876 


40 


374 


gi161958


Trypanosoma 
cruzi 


surface antigen 


ill 


23


374 


gi1334643


Xenopus 
laevis 


APEG precursor protein 


11 A 


26 


375 


AAY99349 


Homo sapiens 


Human PRO1110 (UNQ553) amino
acid sequence SEQ ID NO:31. 


1683 


100 


375 


AAB19729


Homo sapiens 


Human SECX Clone 4339264-2 
encoded protein. 


1683 


100 


375 


AAB15549 


Homo sapiens 


Human immune system molecule from 
Incyte clone 2774913. 


1683 


100 


376 


gi12746394


Homo sapiens 


CUG-BP and ETR-3 like factor 4 
(CELF4) mRNA, complete cds. 


936 


100 


376 


gi13278792


Homo sapiens 


Bruno (Drosophila) -like 4, RNA 
binding protein, clone MGC:2693 
IMAGE:2820541, mRNA, complete
cds. 


911 


98 


376 


gi12804985


Homo sapiens 


Similar to etrl, clone MGC:4320 
IMAGE:2820541, mRNA, complete 
cds. 


911 


98 


377 


gi12746394


Homo sapiens 


CUG-BP and ETR-3 like factor 4 
(CELF4) mRNA, complete cds. 


905 


89 


377 


gi13278792


Homo sapiens 


Bruno (Drosophila) -like 4, RNA 
binding protein, clone MGC:2693 
IMAGE:2820541, mRNA, complete 
cds. 


880 


88 


377 


gi12804985


Homo sapiens 


Similar to etrl, clone MGC:4320
IMAGE:2820541, mRNA, complete 
cds. 


880 


88 
378


gi12841060


Mus musculus 


putative 


809 


75 


378 


gi7293285 


Drosophila 
melanogaster 


CG4768 gene product 


239 


37 


378 


gi1938566


Caenorhabditis 
elegans 


Hypothetical protein C48B6.3 


123 


38 


379 


gi3880385 


Caenorhabditis 
elegans 


predicted using Genefinder~contains
similarity to Pfam domain: PF01484
(Nematode cuticle collagen N-terminal
domain), Score=51.5, E-value=6.1e-12,
N=1~cDNA EST yk94a4.5 comes from
this gene~cDNA EST yk94a4.3 comes
from this gene~cDNA EST yk68d1.5
comes from this gene~cDNA EST
yk68d1.3 comes from this gene


79 


35 


379


gl00o4 


Caenorhabditis
elegans 


unnamed protein product 


TO 


JO 


379 


gUDOZOZ 


Caenorhabditis
elegans 


collagen 


/y 


JO 


380


AAdo53o5 


Homo sapiens 


Novel von Willebrand/thrombospondin-
like mature protein sequence. 


03/ 


QA 
94 


380




Homo sapiens 


Novel von Willebrand/thrombospondin-
like polypeptide. 


00/ 


CkA 


380 


gi12836633


Mus musculus 


putative 


651 


59 


381


gl 1 DUZ4Z04 


Mus musculus 


ribosomal protein L35a 


1 01 


OJ 


381 


gi57119 


Rattus 
norvegicus 


ribosomal protein L35a (aa 1-110) 


191 


53 


381 


gi12846322


Mus musculus 


putative 


191 


53 


382


glJzojM JJ 


Mus musculus 


putative 


01 / 


/ 1 


382 


gi7293113 


Drosophila 
melanogaster 


CG12379 gene product 


283 


72 


382 


gi6042159 


Caenorhabditis 
elegans 


Hypothetical protein F53A3.7 


226 


55 




AArfolUDJ 


Homo sapiens 


TJ m ■ .»_■» i-i m. w*^^^*** T f I Mi*\ t d£ A f\ J-L isms*.* «X ji. a y% J jjf 

riuman protein jiruio4U ammo acia 
sequence. 




inn 
1UU 


383 


gi12841896


Mus musculus 


putative 


925 


98 


383


gl/JUJl^Ht 


Drosophila 
melanogaster 


vajiuidj gene proauct 


olz 




•a cm 




Homo sapiens 


rnKXNA ior rLJuvvj.z protein, partial 
cds. 


flAK 
IDHJ 


0^ 

yj 


384 


oil 0440396 1 

til X VrTrU*/ ^ \J 


Homo sapiens


mRNA for FLJ00031 protein, partial

cds. 


647 


88 


384 


gi1086626


Caenorhabditis 
elegans 


Hypothetical protein C06A63 


273 


33 


385 


gi12053305


Homo sapiens 


mRNA; cDNA DKFZp434G099 (from 
clone DKFZp434G099); complete cds. 


1210 


100 


385 


gi2516239 


Mus musculus 


Rab33B 


1138 


94 


385 


gi12836564


Mus musculus 


putative 


1138 


94 


386 


gi7243247 


Homo sapiens 


mRNA for KIAA1433 protein, partial 
cds. 


3232 


100 


386 


AAB94053 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14222. 


3223 


99 


386 


gi13096872


Mus musculus 


Unknown (protein for MGC:7720)


2906 


89 


387 


gi14599491


Homo sapiens 


small proline-rich protein 2F (SPRR2F)
gene, complete cds.


458


100






387 


gi14599489


Homo sapiens 


small proline-rich protein 2E 
(SPRR2E) gene, complete cds. 


444 


95 


387 


gi338423 


Homo sapiens 


Human small proline rich protein 
(sprII) mRNA, clone 930.


434 


94 


388 


gi6010699


Rattus 
norvegicus 


F-box protein FBL2 


1449 


99 


388 


gi14043139


Homo sapiens 


RIKEN cDNA 2610511F20 gene,
clone MGC:15482 IMAGE:2987858,
mRNA, complete cds. 


1383 


100 


388 


gi12848653


Mus musculus


putative 


1371 


99 


389 


gi2853265 


Rattus 
norvegicus 


jun dimerization protein 2 


800 


96 


389 


gi12248392


Mus musculus


transcriptional inhibitory factor 


795 


95 


389 


gi6648146 


Homo sapiens 


chromosome 14 clone CTD-2317F5 
map 14q24.3, complete sequence.


481 


100 


390 


gi15277240


Homo sapiens 


genomic DNA, chromosome 6p21.3, 
HLA Class I region, section 17/20. 


1296 


100 


390 


gi11875405


Homo sapiens 


HZFwl protein mRNA, complete cds. 


1291 


99 


390 


gi11875407


Homo sapiens 


HZFw2 protein mRNA, complete cds. 


773 


99 


391 


gi6572201 


Homo sapiens 


Human DNA sequence from clone 
CITF22-27C3 on chromosome 
22q13.1-13.31 Contains a gene for a
novel protein (DJ1163J1.2) and part of
a gene for a novel protein (DJ1163J1.3,
similar to mouse B99), ESTs, STSs and 
GSSs, complete sequence. 


863 


100 


391 


gi4469186 


Homo sapiens 


Human DNA sequence from clone 
RP5-1163J1 on chromosome 22q13.2-
13.33 Contains the 3' part of a gene for
a novel KIAA0279 LIKE EGF-like
domain containing protein (similar to
mouse Celsr1, rat MEGF2), a novel
gene for a protein similar to C. elegans
B0035.16 and bacterial tRNA (5-
Methylaminomethyl-2-thiouridylate)-
Methyltransferases, and the 3' part of a 
novel gene for a protein similar to 
mouse B99. Contains ESTs, GSSs and 
putative CpG islands, complete 
sequence. 


863 


100 


391 


AAB92551 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 10735. 


862 


96 


392 


gi5001720 


Mus musculus


odd-skipped related 1 protein 


1413 


97 


392 


gi15778246


Mus musculus 


odd-skipped related 2 


924 


66 


392 


gi15488723


Mus musculus 


Unknown (protein for MGC:19171) 


924 


66 


393 


AAB94364 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14895. 


2700 


99 


393 


gi10434650


Homo sapiens 


cDNA FLJ12895 fis, clone
NT2RP2004187, weakly similar to 
ZINC FINGER PROTEIN 38. 


2700 


99 


393 


gi13623217


Homo sapiens 


Similar to hypothetical protein
FLJ12895, clone IMAGE:3533093,
mRNA, partial cds.


2150


99






394 


gi12053105


Homo sapiens 


mRNA; cDNA DKFZp434K111 (from
clone DKFZp434K111); complete cds.


3116 


100 


394 


gi2282582 


Mus musculus 


actin-binding protein 


2402 


74 


394 


AAR94386 


Homo sapiens 


Human neural cell protein marker 
RR/B. 


2400 


74 


395 


gi207145 


Rattus 
norvegicus 


synaptotagmin II 


2128 


95 


395 


017730733 


I'lUiJ IlllinUUIUJ 


synaptotagmin II


2121 




395 


gi688412 


Mus musculus 


synaptotagmin II/IP4BP


2121 


95 


J7U 


cnlS4R7fi74 ' 
giuto/u /•* 


T-Trtmfk C2>T\1/*T1C 
XJUJ11U CvdpjCIlD 


V70J31 -IClalCU £JIUICU1 i 

complete cds. 




00 
yy 




AARQ261 1 


Homo sapiens


Human protein sequence SEQ ID

NO:10880. 


7fV* 


ion 




A A V07901 
rv/\ 17/ &7X 


Homo sapiens


l^ipiU aSSU waicu piUlClH [IS FnT J 

2764333CD1. 


/Kjj 


inn 


107 


oil i?^inRs 

gll 1 I/O J 


fascicularis 


hypothetical protein




/O 


397 


gi2447128 


Paramecium
bursaria
Chlorella virus
1


contains 10 ankyrin-like repeats;
similar to human ankyrin, corresponds to
P16157


212 


33 


397 


gi6634025 


Homo sapiens 


mRNA for KIAA0379 protein, partial 
cds. 


203 


38 


398 


AAB21047 


Homo sapiens


Human nucleic acid-binding protein
NuABP-51. 


1082 


ion 


398 


gi833629 


Xenopus 
laevis 


nucleoplasmin 


459 


49 


398 


gi64940 


Xenopus 
laevis 


nucleoplasmin (AA 1-200) 


435 


46 


399 


gi15919272


Homo sapiens 


putative forkhead/winged-helix 
transcription factor (FOXP2) mRNA, 
complete cds. 


596 


84 


399 


gi2565057 


Homo sapiens 


CAGH44 mRNA, partial cds. 


596 


84 


399 


gi14582802


Mus musculus 


forkhead-related transcription factor 2 


588 


82 


400 


AAB08199 


Homo sapiens 


Amino acid sequence of human 
diacylglycerol kinase beta 
(DAGKbeta). 


4217 


99 


400 


gi10279722


Homo sapiens 


unnamed protein product 


4217 


99 


400 


gi485398 


Rattus 
norvegicus 


90kDa-diacylglycerol kinase 


4046 


95 


401 


gi7670446 


Mus musculus 


unnamed protein product 


1295 


87 


401 


gi13185203


Homo sapiens 


unnamed protein product 


799 


83 


401 


AAY31642 


Homo sapiens 


Human transport-associated protein-4 
(TRANP-4). 


466 


35 


402 


gi12837990


Mus musculus 


putative 


985 


69 


402 


gi5668737 


Mus musculus 


UBE-lc2 


661 


50 


402 


AAB94645 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15538. 


426 


52 


403 


gi10439821


Homo sapiens 


cDNA: FLJ23209 fis, clone
ADSH00512. 


2596 


99 


403 


gi10440353


Homo sapiens 


mRNA for FLJ00011 protein, partial
cds.


1448


97



WO 02/081731



PCT/US02/01222



Table 2



SEQ ID NO:


Accession No.


Species


Description


Score


%
Identity






403 


gi8217420 



Homo sapiens 



Human DNA sequence from clone 
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ
(DHR, GLGF) domain protein, the
gene for a novel protein similar to
KIAA0552, KIAA0341 and Fugu
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46G11.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven
putative CpG islands, complete 
sequence. 


1026 


100 


404 






Human ORFX ORF1983 polypeptide
sequence SEQ ID NO:3966. 




yo 


404 


ai34172Q7 


Homo sapiens


Human Chromosome 16 BAC clone
CIT987SK-A-635H12, complete


I'll A 


ojc 
yo 


404 


gi15559282


Homo sapiens 


clone MGC:20208 IMAGE:3936339,
mRNA, complete cds.


1021 


53 


405 


gi13365905


Macaca 
fascicularis 


hypothetical protein 


1154 


99 


405 


AAB15537 


Homo sapiens 


Human immune system molecule from 
Incyte clone 27^1 12Q


911 


100 


405 


AAE04891 


Homo sapiens 


Human transporter and ion channel-4 
(TRICH-4) protein.


360 


39 


406 


gi262843


Rattus sp. 


neurotransmitter transporter




yo 


406 


gi545078 


Rattus sp.


Na+/Cl- dependent neurotransmitter
transporter




yo 


406 


AAR88390 


Homo sapiens 


Human neurotransmitter transporter 
protein. 


3668 


96 


407 


AAB31212 


Homo sapiens 


Amino acid sequence of human
polypeptide PRO6004. 


728 


100 


407 


AAB44331 


Homo sapiens 


Human PRO4993 protein sequence
SEQ ID NO:612.


111 


100 


407 


gi4519558 


Rattus 
norvegicus 


Kilon 


667 


94 


408 


gi15277972


Mus musculus 


Similar to DnaJ (Hsp40) homolog, 
subfamily B, member 1 


808 


49 


408 


gi7804472 


Mus musculus 


heat shock protein 40 


808 


49 


408 


AAB72675 


Homo sapiens 


Human HDJ1. 


804 


48 


409 


gi12841015


Mus musculus 


putative 


798 


52 


409 


AAB60114 


Homo sapiens 


Human transport protein TPPT-34. 


787 


51 


409 


gi13435410


Mus musculus 


Similar to RIKEN cDNA 1810012H11
gene 


768 


53 


410 


gi488555


Homo sapiens 


Human zinc finger protein ZNF135
mRNA, complete cds.


1241


52






410 


AAY73346 


Homo sapiens 


HTRM clone 619699 protein sequence. 


1238 


49 


410 


AAB43912 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1357. 


1231 


49 


411 


gi837292 


Rattus 
norvegicus 


S100A1 gene product


278 


59 


411 


AAB45531 


Homo sapiens 


Human S100A1 protein. 


274 


57 


411 


gi11228039


Homo sapiens 


S100A1 cDNA 


274 


57 


412 


AAB19851 


Homo sapiens 


Human muscle-specific protein Ozz. 


1504 


100 


412 


gi13929456


Homo sapiens 


Human DNA sequence from clone 
RP3-337018 on chromosome 20q12-
13.1. Contains the PLPT gene encoding
Phospholipid Transfer Protein, the
PPGB gene coding for Lysosomal
Protective Protein precursor (EC
3.4.16.5, Cathepsin A,
Carboxypeptidase C) and the gene
encoding peroxisomal acyl-CoA
thioesterase (PTE1, thioesterase II),
four novel genes, the gene for a novel
protein similar to Drosophila
Neuralized (Neu) and the 5' end of an
isoform of the TNNC2 gene for fast
troponin C. Contains three CpG
islands, ESTs, STSs and GSSs,


1504 


100 


412 


gi12835750


Mus musculus 


putative 


1328 


89 


413 


gi12847182


Mus musculus 


putative 


875 


87 


413


glHOOtl ID 


Homo sapiens


mKISA, CtiXNA JUJSxZ-p DO^KJUyoZ 

^JiUJUi l/JUilu 3^I\^ Z^^JjvKr\J\JyOJj J 9 polilxU . 

cds. 




10U 


413 


gi10047333




mRNA for KIAA1698 protein, partial

cds. 




49 


414 


gi7959343 


Homo sapiens 


mRNA for KIAA1538 protein, partial 
cds. 


3286 


100 


414 


AAB42721 


Homo sapiens 


Human ORFX ORF2485 polypeptide 
sequence SEQ ID NO:4970.


382 


100 


414 


AAB42764 


Homo sapiens


Human ORFX ORF2528 polypeptide
sequence SEQ ID NO:5056. 


355 


41 


415 


gi14043332


Homo sapiens 


Similar to ring finger protein 23, clone
MGC:2475 IMAGE:3051389, mRNA,
complete cds. 


1006 


43 


415 


gi10716078


Mus musculus 


testis-abundant finger protein 


995 


42 


415 


gi10716076


Homo sapiens 


mRNA for testis-abundant finger 
protein, complete cds. 


966 


40 


416 


gi3599509 


Mus musculus 


rho/rac-interacting citron kinase 


1507 


61 


416 


gi3360512 


Rattus 
norvegicus 


Citron-K kinase 


1505 


89 


416 


gi3599507 


Mus musculus 


rho/rac-interacting citron kinase short 
isoform


1503 


89 


417 


gi2358070 


Mus musculus 


trypsinogen 1 


898 


65 


417 


gi603903 


Gallus gallus 


trypsinogen 


408 


36 


417 


gi65163 


Xenopus 
laevis 


trypsin precursor 


405 


38 








418 


<H44M27 


norvegicus 


f*j* liTrt o 1 vr n n 


11 V) 


87 


418 


A AR447^A 


Homo sapiens 


fiuiiiao rJXKJ f\JJ y*Jir4\£j\)yj p rule in 
sequence SEQ ID NO: 109. 


^7fl 


*rU 


418 




Homo sapiens 


T-Tiimon CXT^C* 1 /^ r\Tt\te*\T\ 

Human ojr^o protein. 


^7fi 
D /U 


418 


419 


AAM06489 


Homo sapiens 


Human foetal protein, SEQ ID NO: 
220. 


376 


82 


419 


gil2oJj3/o 


Mus musculus 


putative 


230 


31 


419 


AAIiOZOoo 


Homo sapiens 


Human four disulfide core domain 
(FDCD)-containing protein. 


222 


31 


420 


AAB42561 


Homo sapiens 


Human ORFX ORF2325 polypeptide 
sequence SEQ ID NO:4650. 


5075 


100 


420 


gi5419865 


Homo sapiens 


mRNA; cDNA DKFZp434N074 (from 
clone DKFZp434N074). 


5070 


99 


420 


gi4589532 


Homo sapiens 


mRNA for KIAA0944 protein, partial 
cds. 


3375 


61 


421 


gil0438804 


Homo sapiens 


cDNA: FLJ22419 fis, clone 

i_rr> /"v\o cm 


1026 


60 


421 


gi!3938187 


Homo sapiens 


hypothetical protein FLJ22419, clone 
Mut: 14900 lMAU.b: J 34 /7o3, mKNA, 
complete cds. 


1026 


60 


421 


gi6690339 


Mus musculus 


hematopoietic zinc finger protein 


717 


47 


422 


A A 13047*7 1 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15739. 


1£7fi 
10/6 


GO 




<ri1 fi43S7£4 


Homo sapiens 


cuss a t*LJ looyo ns, clone 
PLACE2000111. 


10 /o 


oo 
99 


422 


gi5706454 


Homo sapiens 


mRNA for Natural killer cell p44 
related gene 2 (NKp44RG2). 


158 


29 


423 


gil 5026974 


Homo sapiens 


mRNA for obscurin (OBSCN gene). 


2713 


96 


423 


AAB95162 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17205. 


1173 


86 




gil 39381 /0 


Homo sapiens 


clone IMAGE:2961284, mRNA, 
partial cds. 


540 


26 


424 


gli/o01304 


Mus musculus 


putative 


CIO 


51 


424 


AAE02058 


Homo sapiens 


Human four disulfide core domain 
(FDCD)-containing protein. 


485 


38 


424 


gil2655452 


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene). 


485 


40 


425 


eil2830335 


Homo sapiens 


RP11-550O8 on chromosome 20. 
Contains a novel gene encoding a 
protein kinase, an RPL7 (60S 
Ribosomal Protein L7) pseudogene, a 
CpG island, ESTs, STSs and GSSs, 
complete sequence. 




00 


425 


AAB65688 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 216. 


1732 


100 


425 


AAB65690 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 218. 


1184 


69 


426 


gi388518 


Homo sapiens 


Human V beta 5.5 mRNA for a new T 
cell receptor. 


627 


95 


426 


gi36173 


Homo sapiens 


H.sapiens rearranged T-cell receptor 
beta chain mRNA. 


613 


94 


426 


gil552509 


Homo sapiens 


Human germline T-cell receptor beta 
chain TCRBV13S1, TCRBV6S8A2T, 
TCRBV5S6A3N2T, TCRBV13S6A2T, 
TCRBV6S9P, TCRBV5S3A2T, 
TCRBV13S8P, TCRBV6S3A1N1T, 
TCRBV5S2, TCRBV6S6A2T, 
TCRBV5S7P, TCRBV13S4, 
TCRBV6S2A1N1T, TCRBV5S4A2T, 
TCRBV6S4A1, TCRBV23S1A2T, 
TCRBV12S1A1N2, TCRBV21S2A2, 
TCRBV8S1, TCRBV8S2A1T, 
TCRBV8S3, TCRBV16S1A1N1, 
TCRBV24S1A3T, TCRBV25S1A2PT, 
TCRBV26S1P, TCRBV18S1, 
TCRBV10S1P genes from bases 
257519 to 472940 (section 2 of 3). 


606 


100 






427 


AAbU4 ijZ 


Homo sapiens 


Human beta-1,3-galactosyltransferase 
homologue, ZNSSP8. 


A1A 


11 i 


427 


gn45975JJ 


Homo sapiens 


unnamed protein product 


A1A 
4,54 




427 


gil4039836 


Homo sapiens 


beta 1,3 N- 
acetylglucosaminyltransferase Lc3 
synthase mRNA, complete cds. 


434 


33 


428 


•rnri A** 


Homo sapiens 


Human proteasome subunit LMP7 
^allele i-dYLr '*-') iiixviN/\, coiupieic cus. 


OZo 




428 


gi38482 


Homo sapiens 


H.sapiens gene for major 
histocompatibility complex encoded 
proteasome subunit LMP7. 


624 


49 




tri 1 ftS A1A1 
glllO**/**/ 


Homo sapiens 


H.sapiens DMA, DMB, HLA-Z1, IPP2, 
LMP2, TAP1, LMP7, TAP2, DOB, 
DQB2 and RING8, 9, 13 and 14 genes. 






429 


AAG71415 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1096. 


1587 


100 


429 


AAG71594 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1275. 


1344 


83 


429 


AAG72476 


Homo sapiens 


Human OR-like polypeptide query 
sequence SEQ ID NO: 2157. 


1011 


100 


430 


gil0440063 


Homo sapiens 


cDNA: FLJ23392 fis, clone HEP17418. 


3045 


100 


430 


&il5214571 


Mus musculus 


Unknown (protein for 
IMAGE:4207025) 


2396 


80 


430 


gi!770528 


Homo sapiens 


H.sapiens mRNA for translin 
associated zinc finger protein-1. 


687 


38 


431 


gil2859929 


Mus musculus 


putative 


917 


96 


431 


gil5207935 


Macaca 
fascicularis 


hypothetical protein 


301 


96 


431 


gil655637 


Mus musculus 


orf 


147 


27 


432 


gi4585414 


Bacteriophage 
933W 


hypothetical protein 


408 


42 


432 


gi4499798 


Bacteriophage 
933W 


orf15; homologous to ninG gene 


408 


42 


432 


gi5881629 


Bacteriophage 
VT2-Sa 


hypothetical protein 


408 


42 


433 


gil3161184 


Homo sapiens 


cytochrome P450 2S1 (CYP2S1) 
mRNA, complete cds. 


2615 


100 


433 


AAB93056 


Homo sapiens 


Human protein sequence SEQ ID 
NO:11860. 


2527 


100 


433 


gil4042396 


Homo sapiens 


cDNA FLJ14699 fis, clone 
NT2RP2006571, moderately similar to 
CYTOCHROME P450 2G1 (EC 
1.14.14.1). 


2527 


100 


434 


gil3445575 


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


2752 


99 


434 


gil3603727 


Homo sapiens 


glucose transporter (GLUT10) mRNA, 
complete cds. 


2752 


99 


434 


gil 1065680 


Homo sapiens 


Novel human gene mapping to 
chromosome 20, similar to membrane 
transporters. 


2752 


99 


435 


gil3310486 


Homo sapiens 


C2H2 zinc finger protein (SALL3) 
gene, complete cds. 


6094 


99 


435 


gi6688241 


Homo sapiens 


SALL3 gene, exons 1a, 2 and 3. 


6070 


99 


435 


^il296845 


Mus musculus 


spalt protein 


5089 


84 


436 


AAG71445 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1126. 


1312 


85 


436 


AAG71447 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1128. 


924 


61 


436 


gil5293797 


Homo sapiens 


clone OR6M1 olfactory receptor gene, 
partial cds. 


829 


78 


437 


AAB65297 


Homo sapiens 


Human PRO9828 protein sequence 
SEQ ID NO:511. 


1360 


100 


437 


AAG89178 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
298. 


1360 


100 


437 


AAB84652 


Homo sapiens 


Amino acid sequence of fibroblast 
growth factor homologue zFGF12. 


1360 


100 


438 


gi53756 


Mus musculus 


minopontin precursor (AA -66 to 272) 


1521 


100 


438 


gi297546 


Mus musculus 


osteopontin 


1516 


99 


438 


gi50864 


Mus musculus 


T lymphocyte activation protein 


1514 


99 


entry ID 


Description 


*Results 


1 


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00204 11.59 9.700e-12 426-437 


1 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 3.667e-09 33-42 


2 


BL00291 


Prion protein. 


BL00291A 4.49 8.759e-09 185-220 


3 


PF01105 


emp24/gp25L/p24 family. 


PF01105B 25.12 1.000e-40 178-230 


4 


BL00307 


Legume lectins beta-chain proteins. 


BL00307G 9.91 8.531e-10 678-689 


4 


PF00922 


Vesiculovirus phosphoprotein. 


PF00922A 19.17 8.862e-09 281-315 


6 


BL01159 


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76 


6 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323 


7 


BL01159 


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76 


7 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323 


9 


BL00913 


Iron-containing alcohol dehydrogenases 
proteins. 


BL00913D 24.20 8.981e-17 170-204 
BL00913C 7.62 4.375e-11 136-146 
BL00913B 10.94 7.706e-11 86-102 


10 


BL00913 


Iron-containing alcohol dehydrogenases 
proteins. 


BL00913D 24.20 8.981e-17 218-252 
BL00913C 7.62 4.375e-11 184-194 
BL00913B 10.94 7.706e-11 134-150 


11 


BL50062 


BCL2-like apoptosis inhibitors (spans 
part of BH3, BH1 and BH. 


BL50062C 6.66 8.500e-11 349-358 


14 


BL01144 


Ribosomal protein L31e proteins. 


BL01144 25.07 9.069e-26 78-130 


15 


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00204 11.59 6.694e-10 355-366 


15 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 4.000e-09 485-535 


15 


BL00415 


Synapsins proteins. 


BL00415N 4.29 6.727e-12 483-527 
BL00415N 4.29 2.774e-09 118-600 
BL00415P 2.37 4.290e-09 819-855 
BL00415Q 2.23 6.534e-09 474-510 


15 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 4.500e-14 490-505 
PR00049D 0.00 2.500e-12 489-504 
PR00049D 0.00 4.000e-12 491-506 
PR00049D 0.00 8.201e-11 488-503 
PR00049D 0.00 1.205e-10 492-507 
PR00049D 0.00 3.746e-09 487-502 
PR00049D 0.00 5.271e-09 485-500 
PR00049D 0.00 6.644e-09 493-508 


15 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 9.022e-13 471-504 
DM00215 19.43 1.458e-09 483-516 
DM00215 19.43 2.678e-09 469-502 
DM00215 19.43 5.424e-09 468-501 
DM00215 19.43 8.017e-09 470-503 
DM00215 19.43 9.085e-09 466-499 
DM00215 19.43 9.237e-09 484-517 


15 


BL01113 


C1q domain proteins. 


BL01113A 17.99 9.308e-09 116-143 


15 


BL00048 


Protamine P1 proteins. 


BL00048 6.39 5.263e-10 196-223 BL00048 
6.39 3.363e-09 262-289 BL00048 6.39 
9.112e-09 184-211 


17 


PR00773 


GRPE PROTEIN SIGNATURE 


PR00773D 16.14 5.922e-09 215-235 


23 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.300e-26 600-203 
PD00930A 25.62 1.514e-16 497-523 


23 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 4.000e-12 727-746 


23 


PF00182 


GTPase-activator protein for Rho-like 
GTPases 


PF00182B 14.20 7J33e-12 549-128 


25 


BL00375 


UDP-glycosyltransferases proteins. 


BL00375F 16.99 7.061e-35 291-336 
BL00375C 18.27 2.615e-19 126-150 
BL00375D 14.56 9.000e-17 192-220 
BL00375B 21.22 8.627e-16 67-108 
BL00375G 13.01 4.577e-13 390-430 


28 


BL01170 


Ribosomal protein L6e proteins. 


BL01170A 12.34 9.143e-40 139-175 


28 


PD01457 


RIBOSOMAL PROTEIN 40S ZINC- 
FINGER METAL. 


PD01457A 16.51 9.845e-09 67-112 


29 


BL00359 


Ribosomal protein L11 proteins. 


BL00359B 23.07 4.231e-24 56-97 
BL00359C 22.18 6.148e-22 111-145 
BL00359A 20.66 4.000e-21 20-56 


29 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 1.000e-08 40-73 


30 


PR00983 


CYSTEINYL-TRNA SYNTHETASE 
SIGNATURE 


PR00983D 14.16 3.209e-23 270-292 
PR00983C 11.27 3.415e-21 239-258 
PR00983A 11.10 1.878e-12 75-87 


30 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 2.286e-09 314-325 


31 


PR00718 


PHOSPHOLIPASE D SIGNATURE 


PR00718E 8.61 1.000e-08 327-351 


32 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 6.133e-10 49-58 


33 


PF00992 


Troponin. 


PF00992A 16.67 7.972e-10 10-45 PF00992A 
16.67 5.145e-09 17-52 PF00992A 16.67 
6.684e-09 56-91 


34 


BL01019 


ADP-ribosylation factors family 
proteins. 


BL01019A 13.20 8.000e-11 68-108 


34 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 4.938e-20 75-98 
PR00449A 13.20 1.900e-15 34-56 
PR00449E 13.50 6.870e-15 173-196 
PR00449B 14.34 1.360e-10 57-74 
PR00449D 10.79 5.364e-09 137-151 


37 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764F 16.89 7.783e-11 204-225 


37 


DM01077 


SEX HORMONE-BINDING 
GLOBULIN. 


JJiVlUlU / / A 10. jU l.XOJC-lw Hj-y\J 


37 


BL00279 


Membrane attack complex components 
/perforin proteins. 


BL00279E 37.11 9.163e-09 187-235 


38 


PR00832 


PAXILLIN SIGNATURE 


PR00832B 9.87 6.284e-10 768-792 


38 


PR00806 


VINCULIN SIGNATURE 


PR00806A 6.63 9.260e-09 766-777 


38 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.661e-15 766-781 
PR00049D 0.00 3.250e-12 764-779 
PR00049D 0.00 7.277e-11 765-780 
PR00049D 0.00 8.786e-10 763-778 
PR00049D 0.00 9.390e-09 762-777 


40 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 3.172e-34 397-444 
BL00226B 23.86 5.929e-23 230-278 
BL00226C 13.23 4.808e-21 296-327 
BL00226A 12.77 5.065e-13 129-144 
BL00226B 23.86 6.400e-10 181-229 


41 


BL00243 


Integrins beta chain cysteine-rich 
domain proteins. 


BL00243I 31.77 2.014e-09 156-199 
BL00243I 31.77 5.437e-09 159-202 
BL00243I 31.77 5.690e-09 30-73 


41 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 5.865e-09 184-199 


41 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 3.670e-11 66-112 
BL00203 13.94 4.659e-11 40-86 
BL00203 13.94 7.429e-11 70-116 
BL00203 13.94 9.505e-11 140-186 
BL00203 13.94 2.723e-10 21-67 
BL00203 13.94 2.723e-10 61-107 
BL00203 13.94 3.147e-10 105-151 
BL00203 13.94 4.064e-10 22-68 
BL00203 13.94 5.213e-10 161-207 
BL00203 13.94 6.457e-10 26-72 
BL00203 13.94 7.032e-10 184-230 
BL00203 13.94 7.223e-10 80-126 
BL00203 13.94 9.043e-10 130-176 
BL00203 13.94 1.735e-09 175-221 
BL00203 13.94 3.020e-09 150-196 
BL00203 13.94 3.204e-09 65-111 
BL00203 13.94 3.296e-09 95-141 
BL00203 13.94 3.663e-09 135-181 
BL00203 13.94 5.041e-09 47-93 
BL00203 13.94 5.041e-09 85-131 
BL00203 13.94 5.500e-09 100-146 
BL00203 13.94 5.867e-09 126-172 
BL00203 13.94 5.959e-09 90-136 
BL00203 13.94 6.694e-09 170-216 
BL00203 13.94 6.878e-09 151-197 
BL00203 13.94 6.969e-09 17-63 
BL00203 13.94 7.337e-09 115-161 
BL00203 13.94 7.429e-09 71-117 
BL00203 13.94 7.704e-09 171-217 
BL00203 13.94 8.531e-09 155-201 
BL00203 13.94 8.714e-09 165-211 
BL00203 13.94 9.265e-09 116-162 


41 


BL00269 


Mammalian defensins proteins. 


BL00269C 16.52 9.289e-09 28-57 
BL00269C 16.52 9.289e-09 72-101 


41 


PD02283 


PROTEIN SPORULATION REPEAT 
PRECU. 


PD02283C 17.54 5.050e-09 138-166 
PD02283C 17.54 5.175e-09 24-52 

PD02283C 17.54 6.738e-09 113-141 
PD02283C 17.54 7.188e-09 163-191 
PD02283C 17.54 7.750e-09 173-201 
PD02283C 17.54 7.975e-09 128-156 
PD02283C 17.54 8.650e-09 148-176 
PD02283C 17.54 9.325e-09 118-146 


41 


BL00799 


Granulins proteins. 


BL00799D 12.41 7.661e-09 49-96 
BL00799G 9.41 1.000e-08 39-80 


43 


BL00291 


Prion protein. 


BL00291A 4.49 4.414e-09 47-82 


44 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 1549-1561 


44 


BL00142 


Neutral zinc metallopeptidases, zinc- 
binding region proteins. 


BL00142 8.38 2.286e-09 730-741 




44 


PR00480 


ASTACIN FAMILY SIGNATURE 


PR00480B 15.41 3.314e-09 725-744 


45 


BL00414 


Profilin proteins. 


BL00414D 15.59 9.182e-10 81-108 


48 


PR00837 


ALLERGEN V5/IPX-1 FAMILY 
SIGNATURE 


PR00837D 11.12 6.023e-09 22-36 


48 


BL01009 


Extracellular proteins SCP/Tpx- 
1/Ag5/PR-1/Sc7 proteins. 


BL01009E 13.50 8.204e-09 21-37 


49 


BL00284 


Serpins proteins. 


BL00284A 15.64 2.350e-20 85-109 
BL00284D 16.34 4.240e-19 323-350 
BL00284C 28.56 5.600e-17 216-258 
BL00284E 19.15 7.500e-14 408-433 
BL00284B 17.99 9.379e-13 189-210 


50 


BL01283 


T-box domain proteins. 


BL01283A 24.15 2.125e-39 148-196 
BL01283B 23.17 9.438e-34 208-250 
BL01283D 11.70 7.868e-31 298-331 
BL01283C 13.05 8.448e-16 260-274 


50 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 9.182e-26 156-181 
PR00937D 13.41 7.375e-17 259-274 
PR00937B 14.58 8.615e-15 223-237 
PR00937E 11.86 8.541e-14 301-315 
PR00937F 12.53 1.450e-12 322-331 
PR00937C 10.51 1.000e-11 240-250 


50 


PR00938 


BRACHYURY PROTEIN FAMILY 
SIGNATURE 


PR00938C 8.28 6.547e-09 264-282 


50 


PR00427 


INTERLEUKIN-8 RECEPTOR 
SIGNATURE 


PR00427A 16.30 6.776e-09 416-431 


51 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270D 24.66 8.054e-09 50-86 


52 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.543e-13 181-221 


52 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 7.682e-11 150-172 
PR00245C 7.84 5.286e-10 290-306 


52 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 3.700e-09 195-218 
PR00237G 19.63 8.535e-09 326-353 


53 


PR00050 


COLD SHOCK PROTEIN 
SIGNATURE 


PR00050A 11.28 3.143e-12 42-58 
PR00050C 9.82 9.151e-11 85-104 


53 


BL00352 


'Cold-shock' DNA-binding domain 
proteins. 


BL00352B 23.66 2.881e-13 71-110 
BL00352A 12.19 1.327e-10 42-57 


56 


BL01173 


Lipolytic enzymes G-D-X-G family, 
histidine. 


BL01173B 13.27 4.462e-17 140-167 
BL01173C 8.98 4.349e-14 182-196 
BL01173A 9.41 1.818e-13 454-467 
BL01173C 8.98 6.553e-13 495-509 
BL01173A 9.41 8.364e-13 107-120 


57 


PR00321 


GAMMA G-PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00321C 15.39 2.473e-12 123-141 


58 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 1.000e-24 117-142 
PR00937D 13.41 5.500e-18 220-235 
PR00937B 14.58 5.235e-13 184-198 
PR00937F 12.53 1.450e-12 293-302 
PR00937E 11.86 1.918e-12 259-273 
PR00937C 10.51 3.133e-11 201-211 


58 


BL01283 


T-box domain proteins. 


BL01283A 24.15 1.000e-40 109-157 
BL01283B 23.17 9.156e-34 169-211 
BL01283C 13.05 8.286e-17 221-235 
BL01283D 11.70 5.709e-11 2o9-JtK 


58 


PR00938 


BRACHYURY PROTEIN FAMILY 
SIGNATURE 


PR00938C 8.28 7.384e-09 225-243 


59 


PD02059 


CORE POLYPROTEIN PROTEIN 
GAG CONTAINS: P. 


PD02059A 28.10 2.694e-09 116-157 


63 


BL00196 


Ribosomal protein L30 proteins. 


BL00196 34.38 3.250e-15 46-97 


64 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 1.205e-31 264-312 


64 


BL01305 


moaA / nifB / pqqE family proteins. 


BL01305B 10.95 8.875e-09 78-88 


68 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.727e-13 33-67 


69 


PR00874 


FUNGI-IV METALLOTHIONEIN 
SIGNATURE 


PR00874C 4.37 7.214e-10 68-83 


69 


PD00866 


GLYCOPROTEIN PROTEIN SPIKE 
E2 PRECURSOR PEPLOMER. 


PD00866L 3.73 6.564e-10 1-11 
PD00866L 3.73 1.443e-09 26-36 


69 


BL00026 


Chitin recognition or binding domain 
proteins. 


BL00026 12.95 3.013e-09 48-69 


69 


DM01724 


kw ALLERGEN POLLEN CM1 HOL- 
LI. 


DM01724 8.14 3.250e-09 10-30 


69 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.838e-09 111-126 


69 


BL00243 


Integrins beta chain cysteine-rich 
domain proteins. 


BL00243I 31.77 4.838e-10 106-149 
BL00243I 31.77 7.221e-10 18-61 
BL00243I 31.77 1.761e-09 41-84 
BL00243I 31.77 3.408e-09 31-74 
BL00243I 31.77 7.465e-09 71-114 


69 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 4.107e-13 66-112 
BL00203 13.94 2.138e-12 92-138 
BL00203 13.94 1.099e-11 28-74 
BL00203 13.94 3.176e-11 82-128 
BL00203 13.94 3.374e-11 87-133 
BL00203 13.94 5.846e-11 77-123 
BL00203 13.94 7.231e-11 102-148 
BL00203 13.94 1.670e-10 97-143 
BL00203 13.94 2.532e-10 103-149 
BL00203 13.94 5.021e-10 88-134 
BL00203 13.94 7.128e-10 38-84 
BL00203 13.94 7.168e-10 107-153 
BL00203 13.94 7.702e-10 73-119 
BL00203 13.94 9.426e-10 25-71 
BL00203 13.94 1.918e-09 101-147 
BL00203 13.94 2.745e-09 27-73 
BL00203 13.94 4.031e-09 71-117 
BL00203 13.94 4.857e-09 36-82 
BL00203 13.94 5.041e-09 98-144 
BL00203 13.94 5.154e-09 6-52 
BL00203 13.94 6.418e-09 76-122 
BL00203 13.94 7.980e-09 91-137 
BL00203 13.94 8.255e-09 13-59 
BL00203 13.94 8.898e-09 48-94 


69 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 9.514e-09 80-94 


73 


PR00875 


MOLLUSC METALLOTHIONEIN 
SIGNATURE 


PR00875A 5.83 9.679e-10 17-29 


74 


PR00185 


HISTONE H4 SIGNATURE 


PR00185B 13.68 8.888e-09 364-384 


86 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 7.000e-13 200-213 


86 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


TJT AAATQ 1 C A*7 < QCflft 1 1. 9L^X\JSLfH RT AAA9R 

16.07 1.900e-10 184-201 BL00028 16.07 
6.100e-10 371-388 BL00028 16.07 6.914e- 
09 317-334 


86 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 




87 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870D 15.74 8.468e-09 358-393 


88 


BL00048 


Protamine P1 proteins. 


82 BL00048 6.39 5.500e-10 70-97 BL00048 
6.39 2.350e-09 62-89 BL00048 6.39 3.700e- 
09 60-87 BL00048 6.39 5.050e-09 63-90 

T>T AAA A O ^ 1ft, £T 000„ AA ^ 1 ft© DT AAA/1 Q 

BL0U04o 6.39 o.288e-09 ol-oo BLUUU48 
6.39 9.438e-09 71-98 


89 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 8.920e-10 202-217 
PR00320B 12.19 9.486e-10 202-217 
pp aao onr* i o ai n qaa<»_aq 700 _^iy7 

PR00320A 16.74 8.902e-09 202-217 


90 


BL00453 


FKBP-type peptidyl-prolyl cis-trans 
isomerase proteins. 


BL00453A 15.57 1.000e-15 81-96 
BL00453C 9.72 1.000e-12 147-160 


92 


PR00299 


ALPHA CRYSTALLIN SIGNATURE 


PR00299B 17.53 7.211e-09 324-337 


93 


PF00676 


Dehydrogenase E1 component. 


PF00676D 14.40 4.857e-13 421-441 
PF00676C 16.88 1.931e-10 389-413 
PF00676B 24.71 5.433e-10 192-230 


96 


BL00824 


Elongation factor 1 beta/beta'/delta 
chain proteins. 


BL00824B 9.21 3.919e-09 1472-1492 


99 


PR00417 


PROKARYOTIC DNA 
TOPOISOMERASE I SIGNATURE 


PR00417A 12.66 5.415e-09 866-880 


102 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.936e-29 17-56 


102 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.059e-14 435-452 
BL00028 16.07 7.353e-14 351-368 
BL00028 16.07 2.350e-13 295-312 
BL00028 16.07 9.100e-13 491-508 
BL00028 16.07 2.174e-12 463-480 
BL00028 16.07 8.826e-12 211-228 
BL00028 16.07 2.038e-11 379-396 
BL00028 16.07 2.385e-11 323-340 
BL00028 16.07 3.423e-11 239-256 
BL00028 16.07 9.654e-11 407-424 
BL00028 16.07 1.000e-10 267-284 


102 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479A 19.86 6.362e-09 366-389 


102 


PD02462 


PROTEIN BOLA TRANSCRIPTION 
REGULATION AC. 


PD02462A 22.48 7.695e-09 204-239 


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.000e-15 460-474 
PR00048A 10.52 1.000e-14 432-446 
PR00048A 10.52 3.250e-14 320-334 
PR00048A 10.52 4.750e-14 348-362 
PR00048A 10.52 6.250e-14 376-390 
PR00048A 10.52 3.133e-13 292-306 
PR00048A 10.52 1.529e-12 488-502 
PR00048B 6.02 1.000e-11 336-346 
PR00048B 6.02 9.308e-11 224-234 
PR00048B 6.02 2.688e-10 476-486 
PR00048B 6.02 3.250e-10 280-290 
PR00048A 10.52 5.696e-10 404-418 
PR00048A 10.52 6.087e-10 264-278 
PR00048B 6.02 6.187e-10 420-430 
PR00048A 10.52 7.214e-10 236-250 
PR00048B 6.02 8.875e-10 364-374 
PR00048B 6.02 3.368e-09 171-181 
PR00048B 6.02 4.316e-09 448-458 


103 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.438e-37 10-49 


103 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.500e-13 413-430 
BL00028 16.07 1.000e-12 273-290 
BL00028 16.07 1.783e-12 357-374 
BL00028 16.07 7.577e-11 301-318 
BL00028 16.07 9.308e-11 441-458 
BL00028 16.07 9.308e-11 469-486 
BL00028 16.07 1.300e-10 329-346 


103 



PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.000e-14 354-368 
PR00048A 10.52 2.286e-13 298-312 
PR00048A 10.52 9.357e-13 270-284 
PR00048A 10.52 3.209e-12 410-424 
PR00048B 6.02 5.000e-12 286-296 
PR00048B 6.02 1.000e-11 342-352 
PR00048B 6.02 1.000e-11 370-380 
PR00048B 6.02 1.125e-10 314-324 
PR00048A 10.52 2.565e-10 466-480 

DD AAA/1 OA 1A O i| COO« lAyl^fi/KI 

rKUUU4oA W.DZ 4.D2Ze-lU 438-4DZ 
PR00048B 6.02 1.474e-09 454-464 

PR00048B 6.02 4.789e-09 482-492 


103 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


pr»nnA f*f. 1 1 oo 9 onAtt i a o so iiy? pnnnnfifi 

13.92 3.769e-15 317-330 PD00066 13.92 
6.538e-15 373-386 PD00066 13.92 2.800e- 
14 345-358 PD00066 13.92 4.600e-14 457- 
470 PD00066 13.92 4.130e-11 401-414 
PD00066 13.92 9.654e-10 429-442 PD00066 
13.92 5.200e-09 261-274 


103 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024H 13.88 7.353e-09 163-216 


104 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 325-369 


105 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 379-423 


107 


PR00939 


C2HC-TYPE ZINC-FINGER 
SIGNATURE 


PR00939B 13.27 3.209e-09 1302-1311 




108 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.800e-14 279-292 PD00066 
13.92 4.600e-14 307-320 PD00066 13.92 
1.000e-13 335-348 PD00066 13.92 7J00e- 
13 363-376 


108 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.882e-14 319-336 
BL00028 16.07 7.300e-13 347-364 
BL00028 16.07 4.913e-12 291-308 
BL00028 16.07 2.500e-10 263-280 
BL00028 16.07 1.257e-09 375-392 


108 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.214e-13 288-302 
PR00048B 6.02 5.000e-12 304-314 
PR00048A 10.52 6.824e-12 372-386 
PR00048A 10.52 7.353e-12 344-358 
PR00048A 10.52 7.158e-11 316-330 
PR00048B 6.02 7.231e-11 276-286 
PR00048B 6.02 1.000e-09 332-342 
PR00048B 6.02 6.211e-09 388-398 


108 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115Z 3.12 8.842e-18 96-145 
BL00115Z 3.12 7.144e-17 89-138 
BL00115Z 3.12 6.888e-16 103-152 
BL00115Z 3.12 7.791e-15 82-131 
BL00115Z 3.12 3.947e-14 61-110 
BL00115Z 3.12 7.292e-14 117-166 
BL00115Z 3.12 9.164e-14 110-159 
BL00115Z 3.12 1.000e-13 75-124 
BL00115Z 3.12 3.871e-13 54-103 
BL00115Z 3.12 6.819e-13 68-117 
BL00115Z 3.12 4.168e-11 124-173 
BL00115Z 3.12 9.651e-10 47-96 
BL00115Z 3.12 7.485e-09 71-120 
BL00115Z 3.12 9.669e-09 78-127 


109 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.680e-33 391-420 
PR00193C 12.60 4.789e-32 156-184 
rKOOiyiJo 11, oy l.oyze-zo iiu-130 

rKUUiyjil 15*.*fr/ D«>UUe-Zl *¥\j-HI*\ 

PR00193A 15.41 4.130e-20 54-74 
PR00193E 19.47 5.091e-12 444-473 


110 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 2.985e-16 67-115 


110 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 8.660e-13 132-151 


110 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.462e-25 132-163 
BL00107B 13.31 6.143e-10 197-213 


110 


DM00406 


GLIADIN. 


DM00406 7.73 1.800e-09 818-831 


110 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 5.596e-09 815-865 


110 


BL00415 


Synapsins proteins. 


BL00415A 6.15 7.684e-09 796-837 


110 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e-09 801-834 
DM00215 19.43 7.712e-09 797-830 


110 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 4.188e-09 817-836 
PR00209C 4.56 8.929e-09 790-804 


111 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.800e-10 366-377 
BL00678 9.67 5.263e-09 417-428 
BL00678 9.67 6.211e-09 186-197 


111 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308C 3.83 8.892e-10 108-118 
PR00308C 3.83 8.892e-10 109-119 
PR00308C 3.83 8.364e-09 107-117 


111 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320A 16.74 4.000e-13 364-379 
PR00320B 12.19 7.923e-12 415-430 
PR00320A 16.74 5.966e-11 415-430 
PR00320C 13.01 7.214e-11 415-430 
PR00320C 13.01 9.217e-11 364-379 
PR00320A 16.74 9.690e-11 184-199 
PR00320B 12.19 3.057e-10 184-199 
PR00320C 13.01 6.040e-10 184-199 
PR00320B 12.19 6.657e-10 364-379 
PR00320B 12.19 1.450e-09 457-472 
PR00320C 13.01 2.200e-09 240-255 

PR00320A 16.74 4.732e-09 457-472 

DDnnQ^nA i a ha a aqq*± nn ooi ioic 
rKWoZUA 10./4 0.4ooe-uy ^ol-^i/O 

PR00320C 13.01 1.000e-08 281-296 




DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


TY\Jf/tf\C>l91? 9*2 A1 9 1C 1QA All 

DM00547C 17.30 7.000e-19 23-45 

JL/1Y1UU34 /JC Ij.y 4 * J. 134€-I / 1J3-135 

DM00547D 11.60 2.750e-13 105-119 


112 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 4.246e-10 1301-1329 


112 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin). 


PF00426S 15.67 6.438e-10 1271-1309 


112 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039D 21.67 6.793e-10 368-414 


112 


PD02191 


I ATP-BINDING NUCLEOSIDE 
TRANSCR. 


PD02191A 13.95 9.036e-10 107-122 


112 


BL00048 


Protamine P1 proteins. 


BL00048 6.39 1.900e-09 1257-1284 
BL00048 6.39 5.050e-09 1258-1285 


112 


PF00774 


Dihydropyridine sensitive L-type 
calcium channel (Beta subuni. 


PF00774A 16.47 7.130e-09 1280-1326 
PF00774A 16.47 7.730e-09 1276-1322 


112 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


RT AA1 1^7 ^ 19 ^ AA&f> 11 19^4_1 

BL00115Z 3.12 3.302e-10 1261-1310 
BL00115Z 3.12 4.837e-10 1258-1307 
BL00115Z 3.12 7.767e-10 1251-1300 
BL00115Z 3.12 8.167e-10 1263-1312 
BL00115Z 3.12 8.884e-10 1260-1309 09 
1247-1296 BL00115Z 3.12 2.985e-09 1240- 
1289 BL00115Z 3.12 5.632e-09 1265-1314 
BL00115Z 3.12 8.676e-09 1253-1302 
BL00115Z 3.12 9.471e-09 1268-1317 
BL00115Z 3.12 9.735e-09 1257-1306 


112 


PF00186 


Flocculin repeat proteins. 


PF00186I 9.10 5.290e-13 1279-1309 
PF00186I 9.10 6.838e-12 1277-1307 
PF00186I 9.10 2.957e-11 1282-1312 
PF00186I 9.10 7.49oe-11 1276-1306 
PF00186I 9.10 5.200e-10 1268-1298 

PI7AA 1 Q/CT O 1 A *7 /f <Ao 1 A 1 OO© 1 1AC 

PF00186I 9.10 7.450e-10 1280-1310 

uttaai o in / <Aii»J\o ioaa ioq£ 
rrUUloOi y.iu *o*oe-w I/00-I/7O 

PF00186I 9.10 5.252e-09 1285-1315 

PT7AA1 fi/CT 0 1 A £ CW 1 a_AO 1 070 1 3A0 

rrtiuiooi y.iu o.ojie-uy iz/z-ljuz 

PF00186I 9.10 6.102e-09 1274-1304 
ptmn i q/zi o i a n oq^Ca-Ao i ooa i oaa 

PFAftl R/CT Q 1ft R ft1£f»_AQ 10£1 1001 
PFftftl R£T 0 1 ft 0 4^p-ftQ 1 7fv>-1 707 

PF00186I 9.10 9.433e-09 1267-1297 
PF00186I 9.10 1.000e-08 1256-1286 


114 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 8.788e-11 237-256 


114 


BL00383 


Tyrosine specific protein phosphatases 
proteins. 


P»T AftlR^F 1ft 3^ 5 377** 1ft 74ft 051 


1 lO 


PP ft fiRRA 




PPftftRR4F 4 75ftp-ftQ AAQ AAA 


117 


PD02890 


ISOMERASE CHALCONE-- 
FLAVONONE FLAV. 


PD02890C 16.14 8.457e-09 200-235 


118 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 6.513e-10 401-449 


118 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 1.925e-09 196-237 


118 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.678e-09 328-382 
BL01160B 19.54 8.017e-09 654-708 


119 


PD01823 


PROTEIN INTERGENIC REGION 
ABC1 PRECURSOR 

MITOCHONDRION T. 


PD01823C 16.13 7.000e-14 352-373 
PDft1R21B 14 96 1 7R2e-H 12R-^4R 
PD01823D 16.66 6.857e-10 430-451 


119 


pnoi 1 1 5 


PPFPITR^OP AMPHTOTAN ^TTTNJ 
SIGNAL. 


PDftl 1 1 5Pi 17 07 R 471p.no 76R-7R7 


122 


BL00854 


Proteasome B-type subunits proteins. 


BL00854C 29.92 8.435e-19 114-143 


124 


BL00651 


Ribosomal protein L9 proteins. 


P*T 110.651 A 2^ 75 R 477p-1 7 04.1 **4 


125 


BL01245


RIO1/ZK632.3/MJ0444 family

proteins. 


RTD1245F 18 75 2 ^7^e-2^ 314-171 
BL01245A 14.04 8.342e-23 206-231 
BL01245C 13.31 6.564e-15 262-282 
BL01245E 15.28 1.000e-12 320-330
BL01245B 11.91 9.809e-10 245-255


128 


PR00793 


PROLYL AMINOPEPTIDASE (S33)
FAMILY SIGNATURE 


PR00793C 12.24 1.333e-09 168-183 


128 


PR00111 


ALPHA/BETA HYDROLASE FOLD 
SIGNATURE 


PR00111C 13.46 6.000e-09 182-196 


129 


BL01160 


Kinesin light chain repeat proteins. 


BL01160D 10.17 7.077e-09 505-534 


129 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 1.000e-08 695-716 


130 


BL00355 


HMG14 and HMG17 proteins. 


BL00355 5.97 8.412e-32 18-49


130 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.400e-16 34-47 PR00925A
5.47 1.750e-15 18-33 PR00925C 5.57
9.824e-09 51-62 


131 


PR00041 


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE


PR00041E 7^0 2.976e-13 305-326




131 


BL00036 


bZIP transcription factors basic domain 
proteins. 


BL00036 9.02 4.103e-09 299-312


132 


PR00211 


GLUTELIN SIGNATURE


PR00211B 0.86 1.750e-09 205-226 
PR00211B 0.86 8.750e-09 199-220 


132 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.529e-11 201-234

r\liA(\(Y>1 S 10 A\ 9 7fiR*»-10 109-99^ 

nMno9i 5 10 /n & ns4p-in 9n9-9^^ 

LJLvlXJKJ*. LJ 17.HJ H.UJ*tC-lV ^VL"iJj 

DM00215 19.43 6.304e-10 207-240 
DM00215 19.43 7.420e-10 180-213
DM00215 19.43 8.393e-10 196-229 
DM00215 19.43 8.714e-10 218-251 
DM00215 19.43 6.034e-09 185-218
DM00215 19.43 6.034e-09 219-252
DM00215 19.43 6.492e-09 223-256 
DM00215 19.43 7.254e-09 200-233 
DM00215 19.43 9.390e-09 189-222 
DM00215 19.43 9.695e-09 213-246 


133 


BL00455 


Putative AMP-binding domain proteins.


BL00455 13.31 5.125e-11 293-309


133 


PR00154 


AMP-BINDING SIGNATURE 


PR00154A 8.88 6.276e-09 286-298 


136 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELL SI. 


PD00015A 8.90 6.400e-09 243-251 


138 


BL00227 


Tubulin subunits alpha, beta, and
gamma proteins.


BL00227B 19.29 1.000e-40 52-107
BL00227C 25.48 1.000e-40 113-165
BL00227A 24.55 8.200e-36 1-35 


140 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.377e-13 60-75 PR00049D 
0.00 7.500e-10 63-78 PR00049D 0.00 
8.071e-10 61-76 


140 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 8.440e-09 68-82


140 


BL00904


Protein prenyltransferases alpha subunit
repeat proteins proteins.


BL00904A 8.30 9.553e-09 60-110


141 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 6.438e-12 1175-1190 


141 


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins. 


BL01187B 12.04 5.800e-11 1284-1300
BL01187B 12.04 8.200e-11 180-196


141 


BL01248 


Laminin-type EGF-like (LE) domain 
proteins. 


BL01248 11.02 4.343e-12 1362-1375
BL01248 11.02 2.350e-11 322-335 BL01248
11.02 4.125e-10 271-284 


141 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764B 13.56 3.475e-09 1047-1068 


141 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 4.205e-09 185-196


141 


BL01113 


Clq domain proteins. 


BL01113A 17.99 5.673e-09 1621-1210 


141 


PR00011 


TYPE III EGF-LIKE SIGNATURE


PR00011D 14.03 8.895e-12 551-132
PR00011B 13.08 5.846e-11 551-132
PR00011D 14.03 3.215e-10 313-332
PR00011A 14.06 4.214e-10 313-332 
PR00011B 13.08 7.783e-10 313-332 
PR00011A 14.06 7.781e-09 551-132 


141 


BL00420 


Speract receptor repeat proteins domain
proteins.


BL00420A 20.42 8.200e-09 1186-1215




141 


PD02510 


ISOMERASE GALACTOSE-6- 
PHOSPHATE. 


PD02510B 18.31 8.170e-09 548-144 


141 


PR00261 


LOW DENSITY LIPOPROTEIN
(LDL) RECEPTOR SIGNATURE 


PR00261F 11.57 9.544e-09 1052-1074 


141 


PR00288 


PUROTHIONIN SIGNATURE 


PR00288C 10.15 9.165e-09 311-326 


142 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 4.750e-17 114-565


142 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.373e-09 203-257 


142


BL00518


Zinc finger, C3HC4 type (RING
finger), proteins.


RT AO^l 8 1 2 23 4 000e-09 559-1 30 


142 


BL00422 


Granins proteins. 


BL00422E 26.86 8.615e-09 462-498 


143 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.846e-15 141-154 PD00066 
13.92 9.217e-11 551-564 PD00066 13.92

£ 1t\(\a. AO 

o. /uue-uy jzjo jo 


143 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.526e-11 122-136

dd AAA/1 0 A lA^OO 1 '7ilo lAC-ao ^/1A 

PR00048A 10.52 6.087e-10 588-164 
PR00048B 6.02 7.632e-09 138-148 
PR00048A 10.52 8.920e-09 504-518 


143 


PF00651


BTB (also known as BR-C/Ttk) domain
proteins. 




1 A *5 

143 


BL00028


Zinc finger, C2H2 type, domain
proteins. 


RT OnOOC 1 £ 07 7 ^77#»-1 1 ^^-1 1 4 RT .00078 

16.07 2.200e-10 125-142 BL00028 16.07 
S 800e-10 507-524 BL00028 16 07 8 714e- 
09 591-170 BL00028 16.07 9.743e-09 444- 
461 


144 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 3.672e-10 262-285 




BL00215


Mitochondrial energy transfer proteins.


BL00215A 15.82 7.900e-15 16-41
BL00215A 15.82 8.147e-14 260-285
BL00215A 15.82 1.804e-09 166-191
BL00215B 10.44 5.500e-09 114-127


144


PR00927


ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE


PR00927B 14.66 8.644e-09 104-126


147 


DM01417 


6 kw INDUCING XPMC2
MUSHROOM SPAC22G7.04.


DM01417C 12.93 3.250e-11 267-279
DM01417D 11.08 2.200e-10 306-322 


148 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 8.378e-10 349-403


151 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.807e-11 419-434
PR00049D 0.00 8.125e-11 1284-1299
PR00049D 0.00 3.929e-10 1283-1298 
PR00049D 0.00 3.288e-09 417-432 


151 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 3.553e-09 416-466 


154 


BL00665 


Dihydrodipicolinate synthetase 
proteins. 


BL00665D 14.76 1.000e-11 109-132
BL00665C 25.58 5.832e-11 50-101


154 


PR00146 


DIHYDRODIPICOLINATE 
SYNTHASE SIGNATURE 


PR00146D 16.26 2.525e-10 108-126 
PR00146A 12.62 8.615e-09 13-35 


156 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR 


PD02906C 24.17 9.115e-15 171-206
PD02906B 15.35 4.886e-13 142-155
PD02906D 12.27 1.000e-09 239-249
PD02906A 10.84 8.333e-09 92-105 


157 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 2.286e-11 396-412
BL00107A 18.39 6.148e-11 332-363


157 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 4.938e-09 332-351 


160 


PF01008 


Initiation factor 2 subunit 


PF01008B 25.59 9.171e-36 366-409
PF01008A 20.14 8.676e-12 315-336
PF01008C 12.25 7.382e-10 449-469


161 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591D 8.33 6.167e-09 2099-2112


163 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.120e-09 99-113
PR00019B 11.36 7.840e-09 73-87


164 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.000e-14 143-160 


164 


PR00187 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00187A 12.84 8.800e-12 139-159 


165 


PR00310 


ANTI-PROLIFERATIVE PROTEIN 
BTG1 FAMILY SIGNATURE 


PR00310B 10.59 4.000e-39 41-71 
PR00310C 12.74 2.256e-33 71-101 
PR00310D 9.10 9.820e-33 101-131
PR00310A 11.17 7.000e-27 16-41 


165 


BL00960 


BTG1 family proteins. 


BL00960B 24.47 1.000e-40 34-79 
BL00960C 12.68 6.745e-21 98-120 
BL00960A 10.98 5.304e-12 14-26 


166 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.688e-21 124-174 


166 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 4.162e-10 96-133 


166 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.520e-13 456-478 
PR00171E 14.87 2.750e-09 479-492 


166 


PR00172 


GLUCOSE TRANSPORTER 
SIGNATURE 


PR00172D 9.13 6.513e-09 456-480


167


BL00216


Sugar transport proteins.


BL00216B 27.64 5.198e-20 124-174


167 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 4.162e-10 96-133 


168 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.929e-32 59-98 


168 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 2.385e-15 520-533 PD00066 
13.92 2.800e-14 296-309 PD00066 13.92
5.200e-14 240-253 PD00066 13.92 SJ200e- 
14 548-561 PD00066 13.92 9.400e-14 436- 
449 PD00066 13.92 1.000e-13 324-337 
PD00066 13.92 6.143e-12 352-365 PD00066 
13.92 6.885e-10 268-281 


168 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 6.000e-12 237-247 
PR00048A 10.52 6.294e-12 333-347 
PR00048A 10.52 6.824e-12 361-375 
PR00048A 10.52 9.471e-12 249-263 
PR00048A 10.52 4.316e-11 119-133
PR00048A 10.52 4.789e-11 529-543
PR00048A 10.52 6.684e-11 445-459
PR00048A 10.52 8.141e-11 305-319
PR00048B 6.02 6.063e-10 321-331
PR00048B 6.02 6.063e-10 517-527
PR00048A 10.52 7.261e-10 221-235
PR00048B 6.02 7.750e-10 545-117
PR00048B 6.02 1.474e-09 293-303
PR00048A 10.52 2.800e-09 389-403
PR00048A 10.52 1.000e-08 417-431


170 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE


PR00456E 3.06 2.820e-11 6-21 PR00456E
3.06 7.125e-10 3-18 


170 


PD02331 


CYCLIN CELL CYCLE DIVISION 
PROTE. 


PD02331A 19.76 7.429e-15 93-140 
PD02331B 13.43 1.125e-09 174-207 


170 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 5.269e-09 3-18


171 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 4.706e-14 140-161 
PD00126A 22.53 6.824e-14 289-310 


173 


BL00741 


Guanine-nucleotide dissociation
stimulators CDC24 family sign.


BL00741B 14.27 3.418e-11 294-317


173 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 5.154e-11 86-102


173


PR00497


NEUTROPHIL CYTOSOL FACTOR
P40 SIGNATURE


PR00497D 11.91 5.962e-10 91-113


173 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 6.442e-09 277-328 


175 


BL01016 


Glycoprotease family proteins. 


BL01016C 22.84 5.292e-19 60-105 
Rri)l 01 6H 13 71 6 157e-12 307-317 
BL01016E 14.88 3.182e-11 141-169
BL01016D 8.86 6.741e-09 118-131 


175 


PR00789 


O-SIALOGLYCOPROTEIN 
ENDOPEPTIDASE (M22)
METALLO-PROTEASE FAMILY
SIGNATURE


PR00789E 12.42 7.128e-14 141-163
PR00789C 16.11 2.707e-12 85-105
PR00789B 10.48 1.205e-09 64-85 
PR00789D 8.17 7.151e-09 118-131 


176 


PR00850 


GLYCOSYL HYDROLASE FAMILY 
59 SIGNATURE 


PR00850B 6.67 5.455e-09 148-173 


178 


PR00259 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


PR00259A 9.27 8.676e-20 17-41 PR00259C 
16.40 4.750e-17 85-114 PR00259B 14.81
8.615e-12 58-85 PR00259D 13.50 2.528e-11
235-262 


178 


BL00421 


Transmembrane 4 family proteins. 


BL00421B 17.62 6.186e-17 64-103 
BL00421A 11.79 6.800e-12 13-32
BL00421E 20.97 1.514e-10 232-262
BL00421C 12.89 3.600e-09 147-159 


178 


PR00235 


HERPESVIRUS MAJOR CAPSID 
PROTEIN (MCP) SIGNATURE 


PR00235A 14.64 8.000e-09 87-111


179 


BL01052 


Calponin family repeat proteins. 


BL01052C 18.51 6.806e-40 87-127 
BL01052A 16.12 7.180e-32 3-35 BL01052B 
15.31 8.031e-26 52-78 BL01052D 10.26 
1.000e-24 174-194 


179 


PR00890 


SMOOTH MUSCLE PROTEIN 22- 
ALPHA (TRANSGELIN) 
SIGNATURE 


PR00890E 14.34 3.813e-21 135-155
PR00890A 8.61 9.775e-21 34-54 PR00890C 
8.22 1.000e-17 84-98 PR00890B 8.75 
3.455e-17 62-78 PR00890F 12.92 4.064e-14 
161-174 PR00890D 16.17 5.174e-13 118- 
128




179 


PR00888


SMOOTH MUSCLE
PROTEIN/CALPONIN FAMILY
SIGNATURE


PPAARRRU 0 07 S 1 ^Ap-J)C\ 17S-1Q1 
rivuUooon y*y i j.i^tc-au i / j*i7i 

PR00888C 12.27 5.179e-18 52-68

PR00888D 16.09 4.273e-17 88-105

PR00888A 11.87 2.350e-16 3-18 PR00888E 

11 R1 1 £V)f>.\(\ 1fl±-17fl PBftORRRF7 44 
A R75p-1A 17^-14.fi PPfiftRRRfr 17 71 R 7SQp- 

14 162-176 PR00888B 13.72 2.350e-12 22-
36 


179 


PR00889 


CALPONIN SIGNATURE 


PR00889E 12.18 2.726e-12 171-187 


180 


BL00875 


Bacterial type II secretion system 
protein D proteins. 


tjt aaooc a K £ /ti47o AO 1A7 lOO 


181 


PD01351 


PROTEIN REPEAT 
NEUROFILAMENT TRIPL. 


T>r\A1 1C1D 11 7*> < 1<<o AO 110 O/C/l 

rLHHJMo 13. /Z j.Jjje-0y Z35-Z04 


182 


DM01354 


t r|iy> AXTOr'n iiyi' A Of? CD TT 

kw TRANSCRIPTASE REVERSE 11 
ORF2. 


TYKiTAI IC/fll 1 O AA 0 OO/Io 0*7 1 AO 1 AQ 

DMU13>4ii lo.UU o.oZoe-Z/ 
DM01354G 11.57 2.143e-25 78-109 

r\A)TA1 KMT? 1 A 1 A1 At* 1 < AO *7ft 
UMUlJD4r 14. DO 1.414e-lD *tZ-/5 

DM01354E 18.69 8.650e-14 17-47 


1 oo 
182 




Renal dipeptidase proteins. 




185 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins.


BL00039A 18.44 4.000e-25 222-261 
BL00039D 21.67 4.529e-23 498-544 
BL00039C 15.63 4.300e-16 347-371 
rt nnn^oR i o i o o ^70^-1 ^ ifo -7RR 


185 


PD00302 


PROTEASE POLYPROTEIN 

UVTlPP/T AQP A CP 


PD00302B 9.52 1.346e-09 234-250 


186 


PD00066 


PROTEIN ZINC-FINGER METAL- 


PD00066 13.92 5.714e-12 152-165 PD00066 

1 07 f\ 1 dip 17 1 7A_1 XI 


186 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 6.885e-11 136-153 BL00028
16.07 2.200e-10 197-214 


150 


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PPAA71QP 1 ^R ^ 7Q5p fiQ ri7ft A.X) 


1 c< 

loo 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


ppnnfi^fiA in *7 7 o^7«> ini^ 147 
rivuuu^o/v iu.jz z.10 /e-iu uj-ih/ 

PR00048A 10.52 3.739e-10 194-208 
ppAAfuc a m ^7 r nd^p-10 i^i-i7s 

PR00048B 6.02 8.105e-09 121-131


1 C7 


TXI A1 AOO 


x i ivz iamiiy proion/uiigopepuue 
symporters proteins. 


dt Ai 077"R 77104 740p-1 0 10R-154 

J3JL.U 1 vxZD 4,£.±y *f^WC 1U jUO^Jt 


15/ 


PR00669


INHIBIN ALPHA CHAIN
SIGNATURE


PUnn/^AQR R 77 7 01^ft-ftQ 764.7R1 


190 


PR00830 


ENDOPEPTIDASE LA (LON) 
SERINE PROTEASE (S16) 
SIGNATURE 


PR00830A 8.41 3.342e-09 881-901 


191 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.234e-13 261-280 


191 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.000e-23 261-292 
BL00107B 13.31 1.000e-12 341-357 


191 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 6.523e-10 196-244 


191 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins 


BL00479C 12.01 1.000e-09 320-333


191 


PR00834 


HTRA/DEGQ PROTEASE FAMILY
SIGNATURE


PR00834F 10.91 2.946e-09 786-799




193 


BL01033 


Globins profile. 


BL01033A 16.94 2.385e-18 25-47




PR00814


BETA HAEMOGLOBIN 
SIGNATURE 


PR00814A 12.94 1.000e-22 30-47 
PR00814B 9.18 7.750e-18 48-64 


iyj 


PR00175


MYOGLOBIN SIGNATURE 


PR00175B 9.02 9.392e-10 25-49 


194


PR00320


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE 


PR00320B 12.19 6.226e-11 140-155
PR00320A 16.74 4.971e-10 140-155 
PR00320C 13.01 9.280e-10 140-155 


194 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 7.632e-09 142-153 


196 


PR00832 


PAXILLIN SIGNATURE 


PR00832B 9.87 9.174e-10 309-333 


196 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.054e-10 376-430
BL01160B 19.54 6.919e-10 383-437
BL01160B 19.54 9.676e-10 369-423


196 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 8.780e-09 40-55 


196 


BL00087 


Copper/Zinc superoxide dismutase 
proteins. 


BL00087C 20.18 8.784e-09 260-296 


196 


PR00806


VINCULIN SIGNATURE




196 


BL00326


Tropomyosins proteins. 


P.T ftfY*9£A Id 01 0 14^*»-0Q S06-S40 


197 


PR00674 


LIGHT HARVESTING PROTEIN B 

CHAIN SIGNATURE


PR00674A 20.10 7.391e-09 134-155 


198


PR00192


F-ACTIN CAPPING PROTEIN BETA
SUBUNIT SIGNATURE


PROOIO^r* 5O0f»-lfi ^7-R4 PR001Q9D 

8.23 4.462e-36 97-125 PR00192E 8.85

7.000e-33 212-239 PR00192A 8.23 1.474e-

27 5-26 PR00192B 6.20 3.000e-26 26-48


198


BL00231


F-actin capping protein beta subunit
proteins.


BL00231A 8.59 1.000e-40 5-51 BL00231B 
14.16 1.000e-40 84-128 BL00231D 15.40
1.000e-40 165-200 BL00231E 11.66 1.000e-
40 209-246 BL00231C 12.77 1.180e-15 146- 
157 


199 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.750e-10 45-61 


199 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 8.768e-12 87-142 
PF00791B 28.49 7.028e-09 499-1 16 


199 


BL01160 


Kinesin light chain repeat proteins. 


BL01160E 8.74 7.398e-09 323-362 


201 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e-09 183-195 


202 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 4.033e-10 319-370


202 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 4.845e-09 313-366 


202 


PF00992 


Troponin. 


PF00992A 16.67 8.734e-12 333-368 
PF00992A 16.67 2.776e-09 344-379 
PF00992A 16.67 5.026e-09 351-386 


203 


BL00790 


Receptor tyrosine kinase class V 
proteins.


BL00790R 16.20 7.677e-09 29-73


204 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73


205 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73 


207 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 3.077e-17 573-167 
BL00211B 13.37 7.577e-17 1204-1674
BL00211A 12.23 1.900e-09 472-484




PR00478 


PHOSPHORIBULOKINASE FAMILY
SIGNATURE


PR00478A 13.44 4.133e-09 474-492 


£\i l 


PR00802


SERUM ALBUMIN FAMILY
SIGNATURE


PR00802G 14.57 7.188e-09 971-994


207


PR00836


SOMATOTROPIN HORMONE
FAMILY SIGNATURE


PR00836D 13.05 7.125e-09 1504-1519


209


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 1.786e-10 288-303


210 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.


BL00972D 22.55 3.348e-11 388-413
RT /10Q72F 20 72 4 343e-09 415-437 


210 


PR00198 


ANNEXIN TYPE II SIGNATURE


PR00198H 12.05 7.750e-09 682-696


214 


PD00469 


PROTEIN PRECURSOR SIGNAL 

ill UKULA. 


PD00469A 13.95 6.400e-09 73-86 


215 


PF00023 


Ank repeat proteins.


PF00023A 16.03 8.875e-10 839-855 
PF00023A 16.03 2.286e-09 884-900 


213 


PR00342


RHESUS BLOOD GROUP PROTEIN

SIGNATURE 




on 

217 




Bacterial-type phytoene dehydrogenase 
proteins. 


TJT /1AQOOA ID ^1 C/l1Q<» 19 "39C 16ft 

ULuuyoZA I0.41 o.uije-iz jzooou 


zl/ 


PR00368


FAD-DEPENDENT PYRIDINE
NUCLEOTIDE REDUCTASE
SIGNATURE


JtxvUUjOov^ 1j. /*♦ o.yoze-11 j/,0-jDa% 


217 


PR00469 


PYRIDINE NUCLEOTIDE
DISULPHIDE REDUCTASE CLASS-
II SIGNATURE


PR00469I 13.83 7.532e-11 449-468
PR00469F 16.51 7.152e-09 322-347


217




re nisi j?t tt pt ro pt ppTRmsi 
TRANSPORT AROMATIC 
HYDROCARB 


PTVI9049R Ifi 7S S rfvnp-09 196-141 
PD02042A 21.13 9.045e-09 93-120 


217 


PR00419 


ADRENODOXIN REDUCTASE 
FAMILY SIGNATURE 


PR00419A 14.89 9.486e-09 326-349 
PR00419D 10.62 9.534e-09 327-342


218 


PF00157 


PDZ domain proteins (Also known as 
DHR or GLGF).


PF00157 13.40 4.600e-09 688-699 


219 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 7.000e-23 65-96 
BL00107B 13.31 4.214e-10 130-146 


219 


PR00109 


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 7.102e-10 65-84 


219 


BL00240 


Receptor tyrosine kinase class III
proteins. 


BL00240E 11.56 5.029e-09 51-89 


220 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.045e-09 38-50 


220 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN 
H. 


DM01803A 10.51 9.349e-09 34-55 


220 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.160e-11 40-55 PR00049D
0.00 7.807e-11 41-56 PR00049D 0.00
8.336e-11 38-53 PR00049D 0.00 2^86e-10
42-57 PR00049D 0.00 8.857e-10 33-48 
PR00049D 0.00 2.983e-09 37-52 PR00049D 
0.00 9.847e-09 43-58 


222 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 5.337e-10 825-859


222


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.924e-09 516-132 


224 


BL00478 


LIM domain proteins. 


BL00478B 14.79 8.527e-09 143-158 


226 


BL00048 


Protamine P1 proteins.


BL00048 6.39 6.063e-09 199-226


228 


BL00115 


Eukaryotic RNA polymerase II
heptapeptide repeat proteins.


BL00115Z 3.12 5.744e-10 113-162
BL00115Z 3.12 3.449e-09 120-169 


228 


BL00415 


Synapsins proteins. 


BL00415Q 2.23 8.723e-09 253-289 




BL01161


Glucosamine/galactosamine-6-
phosphate isomerases proteins.


BL01161A 19.47 1.000e-40 37-77
BL01161D 28.14 1.000e-40 199-244
BL01161B 21.37 5.091e-39 117-160
BL01161C 18.47 1.500e-23 170-199 


231 


PR00269 


PLEIOTROPHIN/MIDKINE FAMILY 
SIGNATURE 


PR00269A 13.91 3.133e-30 88-113 


231 


BL00181 


PTN/MK heparin-binding protein 


BL00181A 19.07 4.960e-37 76-112
BL00181A 19.07 9.224e-18 78-114


236 


BL00888 


Cyclic nucleotide-binding domain
proteins.


BL00888B 14.79 9.069e-13 499-523 


236 


BL00415


Synapsins proteins. 


BL00415N 4.29 2.774e-09 733-777 


236 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 3.133e-09 646-660 


236 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 3.813e-09 739-758 


236 


DM00668 


ZEIN. 


DM00668A 10.20 8.500e-09 258-273 


238 


BL01188 


GNS1/SUR4 family proteins. 


BL01188B 13.46 4.115e-26 120-151
BL01188C 22.65 4.136e-26 151-202 
BL01188D 8.62 U90e-ll 238-255 
BL01188A 18.82 6.718e-10 55-87 


239 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929B 4.38 8.875e-09 133-583
PR00929C 5.26 8.914e-09 133-144


242 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 2.765e-25 541-151 
BL00232B 32.79 8.263e-22 766-814 
BL00232B 32.79 2.397e-21 67-115 
BL00232B 32.79 4.133e-19 1481-1529 
BL00232B 32.79 1.000e-18 1371-1419 
BL00232B 32.79 2.662e-18 1691-1739
BL00232B 32.79 5.292e-18 1287-1335
BL00232B 32.79 9.147e-18 1148-1196
BL00232B 32.79 1.265e-17 980-1028
BL00232B 32.79 1.529e-17 426-474
BL00232B 32.79 2.588e-17 1084-1132
BL00232B 32.79 1.386e-16 1184-1232
BL00232C 10.65 5.390e-12 1369-1387
BL00232C 10.65 1.391e-11 204-660
BL00232C 10.65 2.174e-11 1584-1164
BL00232C 10.65 4.522e-11 1689-1707
BL00232C 10.65 1.000e-10 65-83
BL00232C 10.65 4.115e-10 1285-1303
BL00232B 32.79 7.200e-10 649-697
BL00232C 10.65 9.827e-10 978-996
BL00232C 10.65 1.947e-09 170-188
BL00232B 32.79 2.137e-09 172-220
BL00232C 10.65 4.474e-09 1182-1200
BL00232C 10.65 8.737e-09 539-119


243 


BL00795 


Involucrin proteins. 


BL00795C 17.06 4.977e-10 64-109 
BL00795C 17.06 6.300e-09 55-100


244 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790I 20.01 7.823e-15 23-54 BL00790I
20.01 9.40Ge-11 310-341 BL00790I 20.01
1.900e-10 117-148 BL00790I 20.01 3.893e-
09 215-246


244 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE


PR00014D 12.04 6.400e-11 30-45
PR00014D 12.04 6.400e-11 317-332

PR00014C 15.44 9.171e-09 204-223 


245 


BL00183


Ubiquitin-conjugating enzymes
proteins.


RT fini S3 9R Q7 7 A37p-1 0 140-1 88 


246 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE


PR00019A 11.19 8.800e-12 205-219 


247 


BL00214 


Cytosolic fatty-acid binding proteins. 


BL00214B 26.51 7.180e-24 206-251 
BL00214A 21.17 6.250e-22 165-191 


247 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178A 15.07 4.913e-21 166-187 
ppnni7ftp on 9 snfip_i 7 996-9^4 

PR00178D 13.52 6.897e-16 272-291
PR00178B 10.52 4.900e-10 200-212 


248 


PR00395 


T> - m/~\C'/^X /AT DO fYTO TNT O 

SIGNATURE 


PPnmo^r 1^17 9 OA7p-13 46-64 


248 


BL00962 


Ribosomal protein S2 proteins. 


BL00962C 15.90 2.846e-12 46-64


249 


BL00227 


Tubulin subunits alpha, beta, and 
gamma proteins. 


rt nn997n ir 46 i nnnp.40 74-198 
BL00227F 21.16 1.529e-33 226-280 
BL00227E 24.15 1.409e-26 178-213


250 




Tubulin subunits alpha, beta, and
gamma proteins.


BL00227C 25.48 1.000e-40 39-91
BL00227D 18.46 1.000e-40 148-202
BL00227F 21.16 1.529e-33 300-354
BL00227E 24.15 1.409e-26 252-287


251 


BL00152 


ATP synthase alpha and beta subunits 
proteins. 


BL00152B 21.40 1.900e-31 191-229
BL00152A 15.38 5.154e-21 134-160
BL00152C 11.41 6.250e-12 291-303


252 


BL00152 


ATP synthase alpha and beta subunits 
proteins. 


BL00152E 22.68 1.000e-32 285-323 
BL00152A 15.38 5.154e-21 134-160 
BL00152C 11.41 6.250e-12 247-259 


253 


BL00518 


Zinc finger, C3HC4 type (RING
finger), proteins. 


BL00518 12.23 2.200e-ll 54-« 


253 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 


254 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 9.739e-12 417-451


254 


PR00417 


PROKARYOTIC DNA 
TOPOISOMERASE I SIGNATURE 


PR00417A 12.66 8.472e-09 65-79 


255 


BL01052 


Calponin family repeat proteins. 


BL01052C 18.51 1.000e-40 88-128 
BL01052A 16.12 2.875e-35 3-35 BL01052B 
15.31 5.219e-26 52-78 


255 


PR00888 


SMOOTH MUSCLE 
PROTEIN/CALPONIN FAMILY 
SIGNATURE 


PR00888D 16.09 9.112e-19 89-106 
PR00888E 11.81 2.800e-18 105-121 
PR00888F 7.44 4.600e-18 126-141
PPHARRRA 11 R7 7 750p 1 R ^ 1R PPftORRRP
17 01 0 7R6e-1 7 52-6R PR00888G 1 2 73 
Q4^Rp-15 16^.177 PPfiftRRRR 13 77 1 321e- 

14 22-36 


255 


PR00890 


SMOOTH MUSCLE PROTEIN 22- 
ALPHA (TRANSGELIN) 
SIGNATURE


PR00890E 14.34 1.429e-27 136-156
PR00890A 8.61 1.000e-26 34-54 PR00890C

R 97 1 60ftp-1Q R5-QQ PPftfiROOR R 75 

6.318e-19 62-78 PR00890F 12.92 1.205e-17
167-175 PRflOROOD 16 17 1 130e-13 119- 
17Q 


257 


BL00745 


Prokaryotic-type class I peptide chain 
release factors signat. 


BL00745C 13.66 1.000e-40 202-249 

RTf)fi745R 77 56 R 6R^pJ*3 14R-101 

DLaJ\J /HDD JO O.UOjC-jj ItO-l/l 

BL00745D 14.90 8.435e-23 280-303 


Zoy 


BL00194


Thioredoxin family proteins.


RT /Ifll 04 1 7 16 7 47Qp-1 0 6RA-697 


260 


BL00612 


Osteonectin domain proteins. 


BL00612E 13.12 3.948e-10 391-436 


260


BL00484


Thyroglobulin type-1 repeat proteins 
proteins. 


dt f\(\AQAn 1 7 01 R 744p.1 1 1 36-1 51 
Rl" j00484R 0 OA 2 145e-10 249-263 
BL00484C 17.01 2.309e-09 269-284
BL00484B 9.04 8.950e-09 116-130 


262


PR00187


DNAJ PROTEIN FAMILY
SIGNATURE


PR00187A 12.84 2.375e-09 288-308


262 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.681e-09 292-309 


7£7 


dt nm 57 


Aminrtfrancfpracpc r-locc \[ nvn/invol. 
/\II]]nOuaIlSICIaSCS ClaoS- V pynUU AiU- 

phosphate attachment site proteins. 


BTj00157A 11 72 8 2G0e-09 16-26 

DLAJV1J / A ll./A O^rWC-V/ UTAV 




PR00320


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE


PR00320B 12.19 2.125e-09 207-222


76^ 
zoo 


PF00913


Trypanosome variant surface
glycoprotein.


PF00913A 7.33 2.500e-09 666-673


266 


BL01144 


Ribosomal protein L31e proteins. 


BL01144 25.07 1.000e-40 21-73


7£R 


TWA ftfK 1 4\ 


i R6 nT^pnrnTKi t "M-Tppmtmat 

loO U lo v-fVjJXyliN 1 IN ~ 1 JZ/I\JVili> /VL*. 


DM0051fi 30 53 8 168e-13 153-198 


268 


BL00132 


Zinc carboxypeptidases, zinc-binding
region 1 proteins.


BL00132C 21.35 7.863e-10 307-348
BL00132A 26.07 8.988e-10 224-265


268 


PR00765 


CARBOXYPEPTIDASE A 
METALLOPROTEASE (M14) 
FAMILY SIGNATURE 


PR00765B 15.57 7.17U-12 276-291 
PR00765D 14.16 1.551e-09 420-434 


ZOo 


BL00170


Cyclophilin-type peptidyl-prolyl cis-


RIJ00170A 17 OR 9 018e-09 485-512 


269 


BL00622 


Bacterial regulatory proteins, luxR 
family proteins. 


BL00622 32.69 9.780e-09 11-58 


270 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.000e-11 447-461
PR00048A 10.52 4.316e-11 389-403
PR00048A 10.52 6.684e-11 362-376


270 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 3.143e-10 37-50 


270 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.000e-10 392-409 BL00028 
16.07 9.100e-10 256-273 BL00028 16.07 
2.286e-09 450-467 BL00028 16.07 8.714e- 
09 365-382 


274 


DM00303 


6 LEA 1 1-MER REPEAT REPEAT. 


DM00303A 13.20 3.310e-09 467-517


275 


PF00622 


Domain in SPla and the RYanodine
Receptor.


PF00622B 21.00 9.357e-14 374-396
PF00622C 12.62 1.857e-12 458-472


275 


BL00518


Zinc finger, C3HC4 type (RING
finger), proteins.


BL00518 12.23 8.800e-11 44-53


277 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


rrootoi 13.00 y.ijje-io to- /o 


278 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 295-308 PD00066
13.92 8.200e-16 519-532 PD00066 13.92
1.692e-15 351-364 PD00066 13.92 4.462e-
15 547-122 PD00066 13.92 4.600e-14 323-
336 PD00066 13.92 4.600e-14 435-448
PD00066 13.92 7.000e-14 463-476 PD00066
13.92 1.500e-13 239-252 PD00066 13.92
3.143e-12 267-280 PD00066 13.92 3.143e-
12 407-420 PD00066 13.92 8.826e-11 211-
224 PD00066 13.92 2.038e-10 491-504 
PD00066 13.92 2.385e-10 379-392 


278 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-16 444-458
PR00048A 10.52 6.727e-15 360-374
PR00048A 10.52 9.182e-15 528-542
PR00048A 10.52 7.000e-14 472-486
PR00048A 10.52 7.750e-14 388-402
PR00048A 10.52 1.000e-13 332-346
PR00048A 10.52 3.133e-13 304-318
PR00048A 10.52 4.857e-13 118-132
PR00048A 10.52 6.786e-13 500-514
PR00048B 6.02 1.000e-12 292-302
PR00048A 10.52 8.941e-12 192-206
PR00048B 6.02 1.000e-11 348-358
PR00048A 10.52 1.947e-11 248-262
PR00048B 6.02 2.385e-11 264-274
PR00048B 6.02 7.231e-11 544-110
PR00048A 10.52 7.632e-11 416-430
PR00048B 6.02 8.615e-11 236-246
PR00048B 6.02 2.688e-10 516-526
PR00048B 6.02 4.375e-10 460-470

PPflHfMOp (\ fY> A in Afifi AQfi 
rtCUUlWoll 0,\JjL 4.J / De-lU 4oo-*f5'o 

PR00048B 6.02 4.938e-10 404-414 
PR00048B 6.02 6.063e-10 320-330 
PR00048A 10.52 7.214e-10 220-234
PR00048B 6.02 1.947e-09 432-442 
PR00048B 6.02 4.316e-09 572-144 


278 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 5.012e-09 191-204 


279 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 6.400e-16 449-462 PD00066 
13.92 6.338e-15 504-517 PD00066 13.92
9.308e-15 421-434 PD00066 13.92 7.000e-
14 476-489 PD00066 13.92 6.087e-11 393-
406 


279 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.500e-17 350-367 BL00028
16.07 5.050e-13 405-422 BL00028 16.07
9.171e-12 433-450 BL00028 16.07 2.731e-
11 488-505 BL00028 16.07 3.077e-11 516-533

£>JLUUU/o lO.U / O. lUUc-lU J 1 


1*7 A 

279 


PD02462


REGULATION AC. 


'DTXYlAA'J A 07 AQ < ytOQa_AA CK 

rJJUZ40ZA zz.4o o.4ooe-uy *to 10 10 


*>*7A 

279 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


BDAArt/ICA 1A CO 1 KA~ 1AH 

rKUUU4oA IU.jz 3.2jUe-Io J4/-J01 
PR00048B 6.02 5.154e-11 501-511
PR00048B 6.02 1.000e-10 446-456
PR00048A 10.52 1.391e-10 513-527
PR00048A 10.52 2.565e-10 485-499
PR00048A 10.52 5.696e-10 402-416
PR00048B 6.02 8.875e-10 418-428
PR00048A 10.52 1.720e-09 430-444
PR00048B 6.02 3.368e-09 390-400

DDArtrt/lO A 1A d O 1AA« AQ 1*7 A 1QQ 

rKUlXHoA W.jZ o.200e-09 374ooo 


285 


BL00276 


Channel forming colicins proteins. 


BL00276A 8.87 6.500e-09 257-269 


286 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.000e-30 10-49 


286 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 6.400e-16 388-401 PD00066 
13.92 3.769e-15 248-261 PD00066 13.92 
9.308e-15 304-317 PD00066 13.92 2.200e- 
14 360-373 PD00066 13.92 2.200e-14 416- 
429 PD00066 13.92 6.400e-14 332-345
PD00066 13.92 1.000e-13 220-233 PD00066 
13.92 2.5lWe-l i 192-205 FLWOUoo 13^2 
5.000e-13 276-289 PD00066 13.92 5.500e-
09 136-149 


286 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.286e-16 260-277 BL00028
16.07 2.588e-14 288-305 BL00028 16.07
2.800e-13 400-417 BL00028 16.07 6.850e-
13 120-137 BL00028 16.07 3.423e-11 148-
165 BL00028 16.07 7.923e-11 344-361
BL00028 16.07 2.500e-10 204-221 BL00028
16.07 2.500e-10 428-445 BL00028 16.07
3.100e-10 316-333 BL00028 16.07 6.100e-
10 176-193 BL00028 16.07 1.771e-09 232-

0A.Q RT./MWWR IfitnS 9nr>p-flQ V7?-^RQ, 


286 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.000e-17 257-271 
PR00048A 10.52 6.727e-15 397-411 
PR00048A 10.52 2.929e-13 285-299 
PR00048A 10.52 9.471e-12 369-383 
PR00048B 6.02 1.000e-11 329-339
PR00048A 10.52 1.474e-11 313-327
PR00048A 10.52 2.421e-11 425-439
PR00048B 6.02 3.077e-11 385-395
PR00048A 10.52 6.684e-11 117-131
PR00048A 10.52 8.141e-11 201-215
PR00048A 10.52 1.783e-10 341-355 
PR00048B 6.02 2.125e-10 301-311 
PR00048B 6.02 2.125e-10 357-367
PR00048B 6.02 2.688e-10 217-227 
PR00048A 10.52 3.739e-10 229-243 
PR00048B 6.02 4.938e-10 273-283 
PR00048B 6.02 1.474e-09 245-255 
PR00048A 10.52 2.440e-09 145-159 
PR00048B 6.02 3.842e-09 161-171 
PR00048B 6.02 8.105e-09 441-451 
PR00048B 6.02 9.053e-09 189-199 


287 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.407e-23 3-42 


287 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 8.941e-14 269-286 BL00028 
16.07 1.000e-13 549-128 BL00028 16.07 
2.565e-12 194-650 BL00028 16.07 6.087e- 
12 241-258 BL00028 16.07 6.870e-12 297- 
314 BL00028 16.07 6.870e-12 381-398 
BL00028 16.07 7.214e-12 493-510 BL00028 
16.07 1.346e-11 465-482 BL00028 16.07
1.692e-11 353-370 BL00028 16.07 3.769e-
11 325-342 BL00028 16.07 6.192e-11 167-
622 BL00028 16.07 8.962e-11 213-230
BL00028 16.07 1.600e-10 409-426 BL00028
16.07 5.200e-10 185-202 BL00028 16.07
6.700e-10 577-156 BL00028 16.07 3.057e-
09 521-538 BL00028 16.07 6.143e-09 437-
454 


287 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.250e-14 238-252 
PR00048A 10.52 3.209e-12 266-280 
PR00048A 10.52 4.706e-12 490-504 
PR00048A 10.52 5.765e-12 462-476 
PR00048A 10.52 7.882e-12 630-644 
PR00048A 10.52 8.941e-12 518-532 
PR00048A 10.52 9.471e-12 164-178
PR00048A 10.52 5.737e-11 378-392
PR00048A 10.52 7.158e-11 546-122
PR00048B 6.02 7.231e-11 180-190
PR00048A 10.52 8.141e-11 210-224
PR00048A 10.52 9.053e-11 294-308
PR00048A 10.52 9.053e-11 406-420
PR00048A 10.52 3.348e-10 322-336 
PR00048B 6.02 3.813e-10 338-348 
PR00048B 6.02 3.813e-10 394-404 
PR00048B 6.02 3.813e-10 478-488
PR00048B 6.02 4.938e-10 506-516 
PR00048A 10.52 8.043e-10 434-448 
PR00048B 6.02 8.875e-10 226-236 
PR00048B 6.02 8.875e-10 450-460 
PR00048B 6.02 1.000e-09 366-376
PR00048B 6.02 1.000e-09 422-432
PR00048A 10.52 3.520e-09 136-588
PR00048B 6.02 7.158e-09 590-600
PR00048B 6.02 7.632e-09 310-320
PR00048B 6.02 7.632e-09 124-572
PR00048A 10.52 9.280e-09 350-364 






DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070C 13.09 6.143e-16 51-63
PR00070D 11.63 2.929e-15 112-127 


lev 




Dihydrofolate reductase proteins.


BL00075A 27.70 7.900e-16 8-39 BL00075B
13.49 3.813e-15 51-63 BL00075C 8.51
2.862e-11 66-79 BL00075D 5.74 8.105e-10
113-123 


000 




FUNGAL PHEROMONE MATING
FACTOR STE2 GPCR SIGNATURE


PR00250D 14.62 9.163e-09 254-278


294 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 2.731e-09 39-57 


294


PR00080


ALCOHOL DEHYDROGENASE
SUPERFAMILY SIGNATURE


PR00080C 17.16 6.464e-ll 191-211 
PR00080A 9.32 9.750e-09 118-130 


295 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 8.920e-09 276-290 
PR00806B 4.28 9.202e-09 275-289


296 


PF00992 


Troponin. 


PF00992A 16.67 3.789e-10 553-588


296 


BL00752 


XPA protein. 


BL00752B 19.17 8.144e-09 130-612 




BL01160


Kinesin light chain repeat proteins.


BL01160B 19.54 8.551e-09 536-590


298 


PR00511 


TEKTIN SIGNATURE 


PR00511C 7.86 4.214e-09 371-388




BL00353


HMG1/2 proteins.


BL00353B 11.47 9.171e-19 228-278


301 


PR00240 


ALPHA-1A ADRENERGIC
RECEPTOR SIGNATURE


PR00240C 8.38 3.941e-10 316-336 


302 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 2.200e-11 54-63


302 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 


305 


PR00193


MYOSIN HEAVY CHAIN
SIGNATURE


xx\\j\JxzfjxJ it.jU LJ^JC-Jl jy\f^*rxZ/ 

PR00193C 12.60 1^09e-25 143-171 
PR00193B 11.69 2.543e-24 95-121
PR00193A 15.41 6.885e-19 39-59 
PR00193E 19.47 3.291e-12 444-473 


305 


BL00675


Sigma-54 interaction domain proteins 
ATP-binding region A proteins. 




306 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 5.920e-ll 47-59 


306 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 7.923e-15 140-153 PD00066 
13.92 4.000e-14 112-125 PD00066 13.92 
1.391e-11 84-97 PD00066 13.92 1.692e-10
168-181 


306 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.059e-14 96-113 BL00028
16.07 4.130e-12 124-141 BL00028 16.07
2.385e-11 68-85 BL00028 16.07 8.269e-11
180-197 BL00028 16.07 8.962e-11 152-169
BL00028 16.07 9.400e-10 319-336 


306 


PR00799 


ASPARTATE 

AMINOTRANSFERASE 

SIGNATURE 


PR00799D 16.46 5.125e-09 188-214 
306


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 1.900e-13 81-91 PR00048A 
10.52 3.133e-13 65-79 PR00048A 10.52 
9.357e-13 121-135 PR00048A 10.52 9.357e-
11 149-163 PR00048B 6.02 2.688e-10 137-
147 PR00048A 10.52 4.522e-10 279-293 

PR00048B 6.02 9.438e-10 109-119 
PR00048A 10.52 3.160e-09 93-107
PR00048B 6.02 8.105e-09 165-175 


307 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELL SL 


PD00015A 8.90 6.400e-09 35-43 


310 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.662e-11 80-114


311 


BL00824 


Elongation factor 1 beta/beta'/delta
chain proteins. 


BL00824C 14.58 1.000e-40 129-167 

TIT rtflRO/Tn 1 A (\A £ 1 Qlf* 10 1 ffl ocvy 
d\Jj\joJ/\\J Ih.Uh O.I7ZCO7 10/-ZUZ 

BL00824B 9.21 2.080e-21 96-116 
BL00824E 12.49 3.333e-19 210-226 


312 


PR00501 


KELCH REPEAT SIGNATURE 


PR00501B 18.88 7.632e-09 476-491 
PR00501B 18.88 9.763e-09 523-538 


313 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 6.200e-30 43-82 


313 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 6.500e-13 439-452 PD00066 

11 o AAA- to Of c o/co nnAAAifi: 10 no 

13.92 o.UOOe-13 355-3oo PD0U066 U.yZ 
1.000e-12 383-396 PD00066 13.92 4.000e-

12 327-340 PD00066 13.92 5.714e-12 411- 
424 PD00066 13.92 8.435e-11 299-312 PD00066 13.92
5.800e-14 467-480


313 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.565e-12 451-468 BL00028 
16.07 2.957e-12 311-328 BL00028 16.07 
3.348e-12 367-384 BL00028 16.07 1.692e- 
11 423-440 BL00028 16.07 2.731e-11 283-
300 BL00028 16.07 2.800e-10 339-356 
BL00028 16.07 9.700e-10 199-216 BL00028 
16.07 1.000e-09 395-412 BL00028 16.07 


313 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 5.909e-15 364-378 
PR00048A 10.52 2.286e-13 308-322 

rKUUVMtoA lU.DZ /.hZ^C-1j J^Z^HJO 

PR00048A 10.52 6.824e-12 448-462
PR00048A 10.52 2.421e-11 196-210
PR00048A 10.52 1.000e-10 280-294 
PR00048B 6.02 3.813e-10 324-334 
PR00048B 6.02 4.375e-10 464-474
PR00048A 10.52 6.870e-10 336-350 
PR00048A 10.52 7.214e-10 420-434 
PR00048B 6.02 7.750e-10 436-446
PR00048B 6.02 4.316e-09 380-390 


314 


PR00121 


SODIUM/POTASSIUM- 
TRANSPORTING ATPASE 
SIGNATURE 


PR00121D 16.72 1.577e-13 210-232 


314 


PR00119 


P-TYPE CATION-TRANSPORTING 


PR00119B 13.94 9.194e-12 217-232






ATPASE SUPERFAMILY
SIGNATURE 




314 


BL01228 


Hypothetical cof family proteins. 


BL01228D 17.44 3.400e-ll 646-671 


314 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


T%T AA1 C AT? OA O H A f\CA** 1 1 AOC 0*7 

BJJU0154E 20.37 4.U54e-l J 4oo-527 
BL00154C 12.38 4.060e-12 213-232

DT AA1 CAT? 0 0^ O <0'7« 1 1 OAT ££G 


315 


BL00888 


Cyclic nucleotide-binding domain 
proteins. 


BL00888B 14.79 1.692e-10 396-420


315 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 8.338e-09 215-682 


315 


DM00668 


ZEIN. 


DM00668A 10.20 8.500e-09 155-170 


316 


PR00727 


BACTERIAL LEADER PEPTIDASE 1 
(S26) FAMILY SIGNATURE 


PR00727C 13.04 9.063e-16 108-128 
PR00727B 12.51 7.848e-11 81-94


316 


BL00501 


Signal peptidases I serine proteins. 


BL00501D 16.69 2.884e-13 108-128 
BL00501C 9.61 9.561e-11 81-93 BL00501B
12.58 7.000e-09 61-77 


317 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.471e-27 13-52 


317 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.235e-14 214-231 BL00028 
16.07 6.850e-13 270-287 BL00028 16.07 
9.100e-13 354-371 BL00028 16.07 1.391e- 
12 158-175 BL00028 16.07 1.346e-11 298-
315 BL00028 16.07 3.769e-11 242-259
BL00028 16.07 6.538e-11 380-397 BL00028
16.07 8.800e-10 186-203 BL00028 16.07
1.514e-09 326-343


317 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 3.000e-12 199-209 
PR00048A 10.52 7.882e-12 351-365 
PR00048A 10.52 8.412e-12 323-337 
PR00048A 10.52 8.941e-12 239-253 
PR00048A 10.52 1.474e-11 211-225
PR00048A 10.52 6.211e-11 155-169
PR00048B 6.02 7.231e-11 311-321
PR00048A 10.52 8.141e-11 267-281
PR00048B 6.02 3.250e-10 339-349
PR00048B 6.02 3.813e-10 255-265

t»t» Ann a or> az no *7 100. in ooi oni 

PR00048B 6.02 7.188e-10 283-293 

PR00048B 6.02 3.842e-09 393-403 
PR00048A 10.52 8.200e-09 295-309 


319 


PR00004 


ANAPHYLATOXIN DOMAIN 
SIGNATURE 


PR00004C 12.46 8.141e-09 91-103 


320 


DM00060 


338 kwNEUREXIN ALPHA ffl 
CYSTEINE. 


DM00060 6.92 6.500e-11 28-38


320 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 7.667e-11 44-55


325 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 5.776e-12 344-363 
PR00020C 13.66 6.932e-10 417-429 


325 


BL00740 


MAM domain proteins. 


BL00740A 13.87 8.313e-12 346-359
BL00740B 19.76 8.500e-09 486-507 


325 


PD02080 


T-CELL GLYCOPROTEIN CD8 


PD02080B 20.69 9.621e-09 123-162 
CHAIN SURFACE ALPHA PRE.




326 


BL00048 


Protamine P1 proteins.


BL00048 6.39 6.128e-10 167-194 


326 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 9.791e-09 220-255 


327 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020C 13.66 2.615e-11 143-593
PR00020B 15.52 5.059e-10 52-69 
PR00020B 15.52 1.789e-09 553-132 


329 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.357e-32 8-47 


329 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 3.209e-14 284-301 BL00028
16.07 4.600e-13 508-525 BL00028 16.07
6.400e-13 368-385 BL00028 16.07 4.115e-
11 396-413 BL00028 16.07 4.115e-11 424-
441 BL00028 16.07 8.269e-11 172-189
BL00028 16.07 8.962e-11 256-273 BL00028
16.07 9.308e-11 312-329 BL00028 16.07
9.654e-11 200-217 BL00028 16.07 3.100e-
10 340-357 BL00028 16.07 5.500e-10 452- 
469 BL00028 16.07 9.100e-10 480-497 
BL00028 16.07 4.086e-09 228-245 


329 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 7.000e-14 272-285 PD00066 
13.92 5.000e-13 328-341 PD00066 13.92 
5.500e-13 188-201 PD00066 13.92 5.500e- 
13 384-397 PD00066 13.92 6.000e-13 496- 
509 PD00066 13.92 6.143e-12 468-481
PD00066 13.92 2.731e-10 440-453 PD00066 
13.92 4.808e-10 160-173 PD00066 13.92 
5.500e-10 244-257 PD00066 13.92 7.000e- 
09 216-229 PD00066 13.92 7.000e-09 412- 
425 


332 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR 


PD02870B 18.83 5.871e-11 468-501


332 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 8.043e-10 275-289 


332 


BL00240 


Receptor tyrosine kinase class III
proteins. 


BL00240B 24.70 4.447e-09 430-454


333 


BL00738 


S-adenosyl-L-homocysteine hydrolase 
proteins. 


BL00738J 18.61 1.000e-40 154-204
BL00738H 23.08 5.320e-36 468-521 

DlMU/ior IZU.5 /.Z0ie-/f JoHly 

BL00738A 16.27 9.660e-27 216-256
BL00738C 16.53 7.923e-25 281-319
BL00738G 14.29 6.268e-23 446-468 
BL00738B 12.28 8.085e-21 256-281 
BL00738E 14.18 9.200e-19 361-384 
BL00738I 14.57 5.135e-17 545-583
BL00738D 7.16 5.109e-13 335-350 


333 


BL00836 


Alanine dehydrogenase & pyridine 
nucleotide transhydrogenase. 


BL00836D 22.30 8.622e-09 424-461 


337 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 3.148e-09 80-100 


342 


PD01823 


PROTEIN INTERGENIC REGION 


PD01823E 9.30 6.824e-12 108-121 
ABC1 PRECURSOR
MITOCHONDRION T. 


PD01823D 16.66 1.265e-09 46-67 




PR00976 


RIBOSOMAL PROTEIN S21 
FAMILY SIGNATURE. 


PR00976C 10.41 2.837e-09 396-407 


'lA'l 




PROLINE-RICH PROTEIN 3 


DM00215 19.43 1.458e-09 473-506 
DM00215 19.43 4.814e-09 463-496


343 


PR00671 


INHIBIN BETA B CHAIN 


PR00671C 4.18 9.172e-09 707-727 


343 


PD01234 


PROTEIN NUCLEAR 

JJ XV v-/ 1V1 Wi-' \J IVL/AJIN 1 JvrYTN i3. 


PD01234B 15.53 1.000e-08 482-500


344 


PR00175 


MYOGLOBIN SIGNATURE 


PR00175B 9.02 2.143e-10 25-49


1AA 




SIGNATURE 


PR00814C 9.20 6.523e-10 66-84


344 


PR00173 


ERYTHROCRUORIN FAMILY 

SIGNATURE


PR00173A 15.91 7.158e-10 25-48


344 


BL01033 


Globins profile. 


BL01033A 16.94 1.000e-16 25-47


344 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 122-139 
PR00612B 10.92 3.483e-10 32-43 

PR00612D 9.76 9.438e-09 74-88


345 


PR00814 


BETA HAEMOGLOBIN 
SIGNATURE 


PR00814C 9.20 6.523e-10 104-122


345 


BL01033 


Globins profile. 


BL01033A 16.94 5.125e-10 63-85 
BL01033B 13.81 8.615e-09 125-137 


345 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 160-177
PR00612B 10.92 3.483e-10 70-81
PR00612D 9.76 9.438e-09 112-126 


349 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.133e-32 6-45


350 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 6.318e-19 364-382

BL00972B 9.45 1.600e-12 445-455


350 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 8.008e-13 121-136 
PR00049D 0.00 7.375e-12 125-140 
PR00049D 0.00 5.916e-11 128-143
ppnnnAon n nn 6 7ARp.i i i 22.1 vi 
ppnnfUQri ft nn q 1 126-141 
PR00049D 0.00 1.286e-10 119-134 
PR00049D 0.00 8.929e-10 127-142 
PR00049D 0.00 2.678e-09 129-144 
PR00049D 0.00 4.051e-09 123-138
PR00049D 0.00 4.051e-09 124-139 
PR00049D 0.00 4.051e-09 130-145 


350 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 7.500e-09 124-145 


350 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 5.339e-10 108-141 
DM00215 19.43 7.268e-10 112-145 
DM00215 19.43 2.525e-09 106-139 
DM00215 19.43 9.695e-09 107-140 


350 


BL00048 


Protamine P1 proteins.


BL00048 6.39 9.888e-09 145-172 


352 


BL00518 


Zinc finger, C3HC4 type (RING 


BL00518 12.23 4.429e-10 214-223
finger), proteins.




jjj 


BL00518


Zinc finger, C3HC4 type (RING

finger), proteins. 


BL00518 12.23 4.429e-10 179-188


354 


BL01009 


Extracellular proteins SCP/Tpx-
1/Ag5/PR-1/Sc7 proteins. 


BL01009D 14.19 9.341e-17 160-181 
BL01009A 13.75 3.769e-14 80-98
RTAIOnOP I** Sft 5 194-210 

BL01009C 10.54 2.667e-11 127-141


354 


PR00838 


VENOM ALLERGEN 5 SIGNATURE 


PR00838G 16.07 2.304e-14 158-178 
PR00838D 8.73 4.452e-12 80-99 PR00838F 

in i 1 7 in i 141 
iu.ii /.J3ze-iu izj-i*h 


354 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE


PR00837C 17.21 7.429e-18 159-176 

PPHHR^TTi 11 19 9 IQR^-1^ 1QS-90Q 
PRr10jtt7R 1 1 64 1 45tte-09 127-141 


356 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 8.500e-17 16-41 
BL00215B 10.44 4.900e-09 177-190
BL00215A 15.82 6.786e-09 133-158 
BL00215B 10.44 7.300e-09 278-291 






MITOCHONDRIAL CARRIER

PROTEIN SIGNATURE 


PR00996F 11 70 6 049e-13 91-110 
PR00926F 17.75 7.600e-11 240-263
PR00926F 17.75 5.219e-10 18-41 PR00926D 
10.53 7.387e-09 246-265 


357 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE 


PR00326A 8.75 7.150e-ll 2M2 


357 


BL00113 


Adenylate kinase proteins. 


BL00113A 12.74 6.677e-09 22-39 


357 


BL01128 


Shikimate kinase proteins.


BL01128A 18.84 7.802e-09 21-55


357 


BL00300 


SRP54-type proteins GTP-binding 
domain proteins. 


BL00300B 20.56 1.000e-08 18-64 


J Jo 




Ubiquitin carboxyl-terminal hydrolases

family 2 proteins. 


BL00972A 11.93 6.318e-19 324-342
BL00972D 22.55 3.903e-16 170-194 
BL00972B 9.45 1.600e-12 405-415 


364 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.482e-10 355-388 


364 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 


PR00217C 10.91 4.600e-10 302-318 


365 


BL00518 


Zinc finger, C3HC4 type (RING
finger), proteins.


BL00518 12.23 2.800e-11 125-134


365 


BL00415 


Synapsins proteins. 


BL00415N 4.29 2.839e-09 387-431 


365 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 7.706e-11 377-410
DM00215 19.43 8.412e-11 333-366
DM00215 19.43 2.678e-09 356-389 
DM00215 19.43 5.138e-09 376-409 


365 


BL01102 


Prokaryotic dksA/traR C4-type zinc 
finger. 


BL01102 15.99 5.705e-09 109-135


365 


PR00211 


GLUTELIN SIGNATURE


PR00211B 0.86 5.959e-11 407-428
PR00211B 0.86 2.212e-10 401-422
PR00211B 0.86 9.500e-09 336-357


365 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.695e-09 335-350 


367 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.33 8.448e-09 2-23


370 


BL00028 


Zinc finger, C2H2 type, domain 
proteins.


BL00028 16.07 7.353e-14 157-174 BL00028 
16.07 1.000e-13 269-286 BL00028 16.07 
8.200e-13 493-510 BL00028 16.07 3.739e- 
12 213-230 BL00028 16.07 6.478e-12 381-
398 BL00028 16.07 1.346e-11 185-202
BL00028 16.07 2.385e-11 129-146 BL00028
16.07 2.385e-11 325-342 BL00028 16.07
5.154e-11 241-258 BL00028 16.07 9.654e-
11 437-454 BL00028 16.07 1.300e-10 297- 

•J 1/1 t>T A/WIG t& fY7 O IHAo-IA 4AG_dO/» 

BL00028 16.07 9.100e-10 465-482


370 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


DT^flAA/;/! 1 1 OO O lC^a 1 C OOO O/IO PTIftHfi^/v 

13.92 3.077e-15 145-158 PD00066 13.92 
8.800e-14 173-186 PD00066 13.92 3.500e- 
13 369-382 PD00066 13.92 8.500e-13 341- 
354 PD00066 13.92 9.133e-12 397-410 
PD00066 13.92 2.174e-11 313-326 PD00066
13.92 3.348e-11 453-466 PD00066 13.92
3.739e-11 481-494 PD00066 13.92 7.214e-
11 257-270 PD00066 13.92 2.038e-10 425- 
438 PD00066 13.92 6.538e-10 201-214 
PD00066 13.92 5.200e-09 285-298 


370 


DM01970 


0kwZK632.12YDR313C 
ENDOSOMAL IE. 


DM01970B 8.60 6.201e-09 265-278 


370 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.474e-11 462-476

im AAA A O A 1 A O £. CO A** 111 OO 1 C\C 

PR00048A 10.52 6.684e-11 182-196
PR00048A 10.52 2.957e-10 434-448 

dd c\c\(\a en a ao < <nn A 1A mo 1/IQ 

PR00048A 10.52 6.478e-10 350-364 

t>p nnnAftn & no & i R7*» in oo£j?3£ 
i^jvUUtMroU o.uz o.io/e-iu zzo-zoo 

PR00048A 10.52 6.870e-10 490-504 

PROMURA 10 52 8 R26e-t0 406-420 

PR00048B 6.02 3.842e-09 170-180 

PR00048B 6.02 4.316e-09 366-376

PR00048B 6.02 4.789e-09 478-488

PR00048B 6.02 7.632e-09 142-152 

PR00048A 10.52 8.122e-09 126-140

PR00048B 6.02 9.053e-09 450-460 


371 


BL01019 


ADP-ribosylation factors family 
proteins. 


BL01019B 19.49 6.276e-21 95-150 
BL01019A 13.20 8.453e-17 51-91 


371 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328C 13.16 8.481e-13 78-104 
PR00328D 12.56 3.357e-11 123-145


371 


BL01115 


GTP-binding nuclear protein ran 
proteins. 


BL01115A 10.22 8.119e-11 21-65


373 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.522e-12 208-225 


373 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 7.000e-13 194-207 PD00066 
13.92 7.000e-13 224-237 PD00066 13.92 
7.000e-12 254-267 


373 


PR00048 


C2H2-TYPE ZINC FINGER 


PR00048A 10.52 1.391e-10 205-219 
SIGNATURE


PR00048B 6.02 6.063e-10 221-231 


374 



PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308A 5.90 7.288e-11 533-548
PR00308A 5.90 8.835e-09 534-549 


377 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 7.538e-09 147-190


j to 


PD01351 


PROTEIN REPEAT
NEUROFILAMENT TRIPL


PD01351A 8.69 7.469e-09 155-166


380 


PF00094 


von Willebrand factor type D domain 
proteins. 


PF00094C 12.88 1.918e-09 43-53 


380 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 3.667e-11 120-135
BL01208B 15.83 1.973e-09 178-193 


joU 


rUUzl jo 


x x\x^^Ux\jSWIv VJlw I KAJr ISXJ l GUN 

SIGNAL CELL. 


PT)ft7 l^ftA 77 6ft Q fK7f*-ftQ 9ft-f*Q 


1ft V 
JOl 


BL01105


Ribosomal protein L35Ae proteins.


THAI IfKR 17 7 qiiv 11 


384 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.205e-10 10-25 PR00049D
0.00 1.915e-09 9-24 


lor 


BL01115


GTP-binding nuclear protein ran 
proteins. 


jdlui i OA 10.// o.yuye-ij .54-/0 


385 


BL00905 


GTP1/OBG family proteins. 


BL00905D 15.00 5.313e-09 140-155


JO J 


PR00449


TRANSFORMING PROTEIN P21

RAS SIGNATURE 


PR00449A 13.20 1.000e-17 34-56 
PPfifV14QD 1ft 79 3 368e-t3 139-153 
PR00449B 14.34 8.364e-11 57-74 PR00449E
13.50 8.286e-09 174-197 


386 



BL00115


Eukaryotic RNA polymerase II

heptapeptide repeat proteins. 


BL001 1 57 3 12 7 977e-l 0 397-446 




PR00041 



CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE 


PR00041F 8.53 9.365e-09 256-274


388 


PF00646 


F-box domain proteins. 


PF00646A 14.37 9.036e-10 28-42 


389 


BL00036 


bZIP transcription factors basic domain 
proteins. 


BL00036 9.02 6.294e-12 81-94 


389 


PR00042 


FOS TRANSFORMING PROTEIN 
SIGNATURE 


PR00042C 8.29 8.105e-13 82-99 PR00042D
8.97 9.895e-10 100-122 


389 



BL00224 


Clathrin light chain proteins.


BL00224B 16.54 3.373e-09 70-123


389 


PR00043 


JUN TRANSCRIPTION FACTOR 
SIGNATURE 


PR00043B 8.73 9.596e-09 81-98 


390 


PF00622 


Domain in SPla and the Ryanodine 
Receptor. 


PF00622B 21.00 2.500e-13 85-107 


391 


BL00564 


Argininosuccinate synthase proteins. 


BL00564A 19.93 6.114e-09 7-44


392 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 230-244 
PR00048A 10.52 4.316e-11 202-216


392 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.125e-15 205-222 BL00028 
16.07 1.391e-12 233-250 BL00028 16.07 
3.400e-10 177-194 


392 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 3.000e-13 193-206 PD00066 
13.92 3.423e-10 221-234 


393 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 1.391e-16 132-154 
393


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 8.800e-10 761-778 BL00028 
16.07 2.029e-09 789-806 


7 A7 


PR00048


C2H2-TYPE ZINC FINGER

SIGNATURE 


PR0fift4RA 10 12 7 R00e-09 758-772 


394 


DDHACA1 


TACT m PTJPT* AT ClfTM A TT TOP 


PBfWHfil A R 9S 1 xtflQp-HQ S^7-SS1 


394 


DM00099 


4 kw A55R REDUCTASE 

1 JaJKJVLUN Al* JJlxl i iJKUx 1 ClvliJJiNIl 


DM00099B 14.73 4.375e-09 415-425 


395 


PR00399 


SYNAPTOTAGMIN SIGNATURE 


PR00399A 9.32 3.133e-19 146-162

PR 001000 17 R7 ft 700«»-l 7 777-738 

PR00399B 14.27 7.750e-16 161-175 
PR00399D 14.48 4.000e-14 242-253 


395 


PR00360 


C2 DOMAIN SIGNATURE


PPftrft/tfiR HAI C 9AQ<* 17 90,1 _91^ 
xxVUUjOUxj 1 J.Ol o.ZOyC" 1 J 

PR00360A 14.59 2.800e-12 174-187 
PR00360B 13.61 5.217e-12 340-354 

PPfWi7£fiA 14 ^ 9ft7*»-in 71 1-194 



395 


PF00168


C2 domain proteins. 


PlTflfil fSIC 1 97 4Q ^ ^OAo 1R 797-140 

PF00168B 11.83 2.000e-09 306-317 


396 



BL01013 


Oxysterol-binding protein family 
proteins. 


P*T ft1/l17A 9^ U7 971 r» 91 SfR.l^fi 

RT/iini7p» ii 77 i nnnp-11 irs-iqg 

Oi^lUljD 11.JJ 1.UUW11 !OJ*17U 


7Q£ 


PF00791


Domain present in ZO-1 and Unc5-like
netrin receptors. 


PF00791B 28.49 1.514e-10 52-107


396 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 

AlNrv. 1 Xv. 


PD00078B 13.14 9.000e-ll 173-186 
PD00078B 13.14 3.739e-09 78-91

PD00078B 13.14 4.130e-09 45-58




PF00023


Ank repeat proteins.


PF00023B 14.20 3.077e-11 48-58 PF00023B
14.20 3.769e-11 176-186 PF00023A 16.03
7.429e-09 85-101


397 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 1.750e-10 55-71 


397 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 4.455e-11 55-110
PF00791B 28.49 7.291e-10 88-143 


398 


BL00422 


Granins proteins. 


BL00422C 16.18 5.787e-10 134-162 


400 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450D 16.58 8.986e-ll 161-181 


400 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479B 12.57 4.273e-15 287-303 
BL00479A 19.86 2.667e-14 261-284 

PT fW4"7Qn 19 ^7 1 7^0j> 1A 7^1 *XfH 


400 


PR00171 


CLASS III CYTOCHROME C
SIGNATURE


PR00171D 7.30 9.419e-10 334-342 


400 


BL00018


EF-hand calcium-binding domain
proteins. 


BL00018 7.41 3.348e-09 223-236


400 


PF00781 


Diacylglycerol kinase catalytic domain 
proteins (presumed). 


PF00781F 16.43 1.000e-40 600-199 
PF00781B 12.07 8.364e-35 454-486
PF00781D 11.11 3.077e-30 532-118
PF00781C 9.69 5.034e-19 506-521 
PF00781E 12.45 2.385e-17 124-583 
PF00781G 10.09 6.211e-17 678-692 
PF00781H 12.20 1.750e-16 770-782
PF00781A 6.42 3.667e-09 354-360 


401 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 7.407e-09 325-340


402 


DM01117 


2 kw TRANSPOSASE WITHIN 


DM01117A 11.17 7.750e-09 364-382 
SEQ ID
NO:


Database
entry ID


Description


*Results






TRANSPOSITION VASOTOCIN. 




403 


DM01206 


CORONA VIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 9.286e-12 724-744
DM01206B 10.69 3.466e-10 726-746 
DM01206B 10.69 9.630e-10 722-742 
DM01206B 10.69 7.152e-09 718-738 
DM01206B 10.69 8.861e-09 728-748 


403 


BL00048 


Protamine P1 proteins.


BL00048 6.39 4.197e-10 722-749 BL00048 
6.39 5.500e-10 731-758 BL00048 6.39
6.329e-10 729-756 BL00048 6.39 9.171e-10
730-757 BL00048 6.39 4.038e-09 728-755
BL00048 6.39 8.538e-09 724-751 BL00048
6.39 9.438e-09 716-743 


403 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 9.690e-09 130-144 


404 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.353e-27 31-70 


404 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 5.154e-15 274-287 PD00066 
13.92 7.600e-14 246-259 PD00066 13.92 
8.200e-14 302-315 PD00066 13.92 3.143e- 
12 218-231 PD00066 13.92 4.000e-12 190- 
203 PD00066 13.92 2.800e-09 330-343 


404 




Zinc finger, C2H2 type, domain

proteins. 


BL00028 16.07 7.261e-12 230-247 BL00028 
16.07 9.171e-12 342-359 BL00028 16.07 
4.300e-10 314-331 BL00028 16.07 7.000e- 
10 174-191 BL00028 16.07 3.314e-09 202-
219 BL00028 16.07 6.400e-09 286-303 


404 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.214e-13 339-353
PR00048A 10.52 3.209e-12 227-241 
PR00048A 10.52 1.947e-11 311-325
PR00048A 10.52 4.522e-10 171-185 
PR00048B 6.02 2.895e-09 299-309 
PR00048A 10.52 4.600e-09 199-213 
PR00048B 6.02 1.000e-08 187-197 
PR00048B 6.02 1.000e-08 271-281 


406 


BL00610 


Sodium:neurotransmitter symporter
family proteins.


BL00610A 17.73 1.000e-40 68-118 
BL00610B 23.65 1.000e-40 132-182 
BL00610C 12.94 1.000e-40 225-277
BL00610D 20.57 1.000e-40 291-344
BL00610F 29.02 6.143e-36 540-157 
BL00610E 20.34 3.209e-35 448-491 
BL00610G 12.89 2.200e-15 173-196 


406 


PR00176 


SODIUM/NEUROTRANSMITTER
SYMPORTER SIGNATURE 


PR00176C 10.84 6.226e-23 141-168 
PR00176A 16.82 1.450e-22 68-90 PR00176F 
10.73 8.667e-20 452-472 PR00176B 7.31
7.000e-18 97-117 PR00176D 9.02 1.000e-17 
252-270 PR00176E 11.41 2.756e-15 334-355 
PR00176H 15.27 7.353e-15 131-590 
PR00176G 12.48 5.615e-14 529-112 


407 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.304e-09 111-121 


408 


PR00187 


DNAJ PROTEIN FAMILY 


PR00187B 13.48 1.800e-16 45-66
PR00187A 12.84 6.700e-12 15-35


408 


BL00198 


Nt-dnaJ domain proteins. 


BL00198B 15.11 9.217e-15 45-66
BL00198A 8.07 2.459e-11 19-36


409 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 4.136e-11 246-268


409 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.787e-11 108-133
BL00215B 10.44 6.211e-11 258-271
BL00215A 15.82 5.018e-09 211-236


409 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926D 10.53 5.355e-09 19-38 


410 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 6.400e-17 411-424 PD00066 
13.92 8.200e-17 327-340 PD00066 13.92 








5.154e-15 271-284 PD00066 13.92 2.800e- 
14 215-228 PD00066 13.92 9.000e-13 355- 
368 PD00066 13.92 6.143e-12 439-452
PD00066 13.92 6.478e-11 187-200 PD00066
13.92 9.217e-11 243-256


410 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.588e-14 227-244 BL00028 
16.07 6.824e-14 395-412 BL00028 16.07 
7.882e-14 171-188 BL00028 16.07 2.350e- 
13 339-356 BL00028 16.07 7.300e-13 283- 
300 BL00028 16.07 7.300e-13 367-384 
BL00028 16.07 2.565e-12 423-440 BL00028 
16.07 7.261e-12 199-216 BL00028 16.07 
7.261e-12 311-328 BL00028 16.07 8.435e- 
12 451-468 BL00028 16.07 2.038e-11 255-
272 BL00028 16.07 9.400e-10 143-160 


410 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 3.250e-14 280-294 
PR00048A 10.52 8.500e-14 336-350

Tvc%f\t\f\AQ k ia ci n >nn a 11 oci ice 

PR00048A 10.52 7.429e-13 252-266
PR00048A 10.52 8.714e-13 448-462 

rKU0U45A lU.DZ /e-li 3yz-*HJO 
PR00048A 10.52 1.000e-12 168-182 

"DPAAfMRA 1ft <*5 0 ftCQo 19 Alh-AIA 

PttrfWMRR A ft? 8 /tl^A-l 1 dftRJLIR 
riS\A)\JHOD O.Ux O.OljC-11 HvO-*rlO 

PR00048B 6.02 7.188e-10 268-278 
PR00048B 6.02 7.188e-10 380-390
PR00048B 6.02 9.438e-10 296-306 
PR00048B 6.02 1.000e-09 324-334 
PR00048B 6.02 1.474e-09 352-362 
PR00048B 6.02 3.842e-09 212-222 
PR00048B 6.02 5.263e-09 436-446 


411 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 5.500e-10 63-76 


413 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014C 15.44 4.600e-10 73-92 


414 


PR00806 


VINCULIN SIGNATURE 


PR00806A 6.63 1.493e-09 785-796 


414 


PR00048 


C2H2-TYPE ZINC FINGER 


PR00048A 10.52 4.240e-09 41-55






<5 TOM ATI TOP 




414 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.546e-11 781-796
PR00049D 0.00 1.205e-10 263-278 
PR00049D 0.00 4.356e-09 785-800 


A 1 A 

414 


tot aaai 7 


Neuromodulin (GAP-43) proteins. 


ftfMI 7FI A7^*»_A0 A9H-A71 
OLMKJHlZLf 10.O4 4.0/Je-UlJ 4ZU-H/ I 


414 


BL00422 


Granins proteins. 


BL00422C 16.18 6.318e-11 439-467

Til AA/17'}^ , U 160 OAOa 1 A AAf\-AfSt 
DLUU4ZZvs 10.1 o y.owe-lU 44U-406 

BL00422C 16.18 6.294e-09 441-469

t*T (\ClATyC* 1£ 1 R £ 7AQa_AQ /<1Q A££L 
DL\JV*+££Ks 10.15 Ou.\)yG-\)y 4.5 0-400 


414 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
MON A 1 UKJb 


PR00910A 2.51 8.179e-09 265-278 


414 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 4.203e-09 770-803 

TYK4AAT1 CIO A1 Q AOC a AO O70 

JDM0U213 iy.4J y.Oooe-UJJ 243-2 /o 


414 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 1.257e-09 44-61 BL00028
16.07 2.343e-09 175-192 BL00028 16.07
6.143e-09 119-136 BL00028 16.07 9.743e-
09 147-164


415 


PF00622 


Domain in SPla and the RYanodine
Receptor


PF00622B 21.00 1.000e-13 331-353 


415 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 3.400e-11 3M0


416 


PF00780


Domain found in NIK1-like kinases,
mouse citron and yeast ROM.


PF00780B 23.03 5.929e-33 442-485


410 


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PPAA1 AGI) 17 77 < 71Ci» 17 71 1 71A 




BL00107


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 3.200e-22 211-242

PJjftO,107ft n^l Q inR#07 7R1-7QQ 


416 


BL00239 


Receptor tyrosine kinase class II 


BL00239B 25.15 5.164e-10 145-193 


416 


BL00915 


Phosphatidylinositol 3- and 4-kinases


BL00915C 22.43 9.357e-10 203-242


417 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 1.482e-14 41-59 
BL00021D 24.56 2.122e-12 193-235


417 


PR00722 


CHYMOTRYPSIN SERINE
PROTEASE FAMILY
SIGNATURE


PR00722A 12.27 7.517e-14 42-58 
PRG0722R 17 ^1 1 14^<»-1 0,97-1 12 


417 


BL00134 


Serine proteases, trypsin family, 
histidine proteins. 


BL00134A 11.96 6.464e-16 41-58 
BL00134C 13.45 2.059e-09 221-235 


417 


BL00495 


Apple domain proteins. 


BL00495O 13.75 2.440e-09 212-241 


417 


BL00672 


Serine proteases, V8 family, histidine 
proteins. 


BL00672A 9.79 9.520e-09 41-57 


417 


PR00839 


V8 SERINE PROTEASE FAMILY 
SIGNATURE 


PR00839B 11.20 9.753e-09 41-59


418 


BL01207 


Glypicans proteins. 


BL01207B 23.69 9.122e-28 191-237 
BL01207A 12.21 1.000e-16 62-78 


423 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870D 15.74 4.351e-09 693-728 


423 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.696e-09 793-803


424 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 5.041e-09 13-59 






Protein kinases ATP-binding region
proteins.


BL00107A 18.39 8.141e-18 217-248


425 


BL00240 


Receptor tyrosine kinase class III
proteins. 


BL00240E 1LS6 6.040e-10 203-241 


425 


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PPftftlftQR 12 97 ^ 814p-14 917-936 
PR00109A 15.00 1.730e-09 182-196 


428 


PR00141


PROTEASOME COMPONENT

SIGNATURE 


PR00141D 12.45 8.615e-12 259-271 
ppnni air 1 1 i ^ Q ^£1 1 9 991 91^ 

PR00141A 11.36 2.050e-11 102-118


428 


BL00854 


Proteasome B-type subunits proteins. 


RTAAfi^AA 17 Ol 1 7ft7f» 1QQQ-14S 

BL00854C 29.92 5.235e-14 206-235 

RT AAS<vAn ROlV-fiQ 957-967 


429 


PR00245 


OLFACTORY RECEPTOR 

SIGNATURE


PR00245A 18.03 9.413e-17 59-81 

PPAA9A5P 7 8 A 7 SfMV_1 6 93R-9S4 

PR00245E 12.40 2.500e-12 291-306 
PR00245B 10.38 9.112e-11 177-192


429 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 7.120e-12 199-223
PR00237C 15.69 1.225e-09 104-127


429 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 9.727e-14 90-130 
BL00237D 11.23 1.273e-09 282-299


429 


PR00534 


MELANOCORTIN RECEPTOR 


PR00534A 11.49 6.400e-09 51-64


430 


PF00651 


BTB (also known as BR-C/Ttk) domain
proteins.


PF00651 15.00 1.000e-11 87-100


430 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.706e-14 474-491 BL00028 
16.07 1.771e-09 502-519 






PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 4.300e-09 490-503


430 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.600e-09 499-513 


433 


BL00086 


Cytochrome P450 cysteine heme-iron 
ligand proteins. 


BL00086 20.87 3.209e-23 430-462 


All 

433 




c at a oo pa^a nvni TP TV i 

QTYTWATTTOP 
olVJlNAl UivD 




433 


PR00359 


B-CLASS P450 SIGNATURE 


PR00359G 11.22 8.071e-10 401-417 
PR00359F 24.20 2.180e-09 373-401 


433 


PR00385 


P450 SUPERFAMILY SIGNATURE 


PR00385E 12.66 8.800e-11 440-452
PR00385D 13.11 4.429e-10 431-441
PR00385A 14.97 5.865e-09 302-320 


433 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE 


PR00464G 12.41 9.000e-10 405-421
PR00464D 17.40 1.191e-09 320-338
PR00464E 18.28 6.946e-09 349-370
PR00464H 13.32 7.750e-09 427-441
PR00464C 18.84 9.014e-09 291-320
PR00464I 14.64 9.481e-09 440-464


434 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 7.943e-19 101-151 


434 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.593e-11 413-435


435


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.429e-10 10-25 


435 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.150e-13 138-593 BL00028 
16.07 6.850e-13 1010-1027 BL00028 16.07 
6.087e-12 982-999 BL00028 16.07 8.615e- 
11 846-863 BL00028 16.07 3.100e-10 317- 
334 BL00028 16.07 7.000e-10 170-187 
BL00028 16.07 8.500e-10 289-306 BL00028 
16.07 8.800e-10 548-565 


435 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDL 


PD00066 13.92 7.600e-14 998-1011 
PD00066 13.92 1.000e-11 305-318 PD00066
13.92 8.826e-11 564-577 PD00066 13.92
3.400e-09 862-875 


435 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 5.329e-09 177-192 
PR00456E 3.06 5.899e-09 140-155 


435 


BL00999 


Streptomyces subtilisin-type inhibitors
proteins.


BL00999A 14.95 7.223e-09 461-499


435 


PR00048 


C2H2-TYPE ZINC FINGER 

SIGNATURE


PR00048A 10.52 9.357e-13 573-587 

i IVVUI>*tO/\ lU.J^ ^.*r^lC-ll 1\J\J f - L\J£i l 

PR00048B 6.02 2.125e-10 561-571
PR00048A 10.52 8.043e-10 314-328 
PR00048B 6.02 1.000e-09 995-1005 
PR00048B 6.02 6.684e-09 302-312 
PR00048A 10.52 9.280e-09 167-181 




PR00245


OLFACTORY RECEPTOR
SIGNATURE 


PR00245A 18.03 2.667e-23 100-122
PR00245C 7.84 1.783e-14 232-248 
PR00245D 10.47 7.070e-10 268-280 


436 


PR00237 


RHODOPSIN-LIKE GPCR 


PR00237C 15.69 8.500e-11 145-168






SUPERFAMILY SIGNATURE


PR00237G 19.63 6.023e-09 266-293 


436 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.161e-15 131-171 
BL00237D 11.23 8.091e-09 276-293 


437


PR00262 


IL1/HBGF FAMILY SIGNATURE 


PR00262A 28.26 1.000e-08 80-108 


438 


BL00884 


Osteopontin proteins. 


BL00884B 12.47 1.000e-40 50-94
BL00884C 22.45 6.187e-39 131-173
BL00884A 11.35 5.846e-32 1-31 BL00884E
11.04 8.364e-23 273-295 BL00884D 8.79
3.323e-18 255-272


438 


PR00216 


OSTEOPONTIN SIGNATURE 


PR00216B 7.89 4.553e-34 37-67 PR00216A
10.94 8.054e-33 2-32 PR00216C 9.63
2.565e-32 67-93 PR00216G 12.39 8.676e-27
238-264 PR00216H 7.41 5.295e-22 273-293
PR00216F 11.79 3.133e-21 164-183
PR00216D 2.74 5.800e-18 104-119
PR00216E 8.44 4.405e-16 132-147


SEQ
ID 


Pfam Model 


Description 


E-value 


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 


1


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-
H type


1.8e-05 


31.6 


1 


412-438 


1 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


2e-05 


21.8 


1 


14-52 


3 


EMP24_GP25L 


emp24/gp25L/p24 family 


4.1e-105 


362.6 


1 


22-235 


6 


WW 


WW domain 


1.2e-05


32.2 


1 


45-75 


7 


WW 


WW domain 


1.2e-05 


32.2 


1 


45-75 


8 


Aa_trans 


Transmembrane amino acid 
transporter protein 


9.6e-64 


225.2 


1 


7M51 


9 


Fe-ADH 


Iron-containing alcohol 
dehydrogenase 


9.9e-35 


124.5


2


4-205:228-
255


10 


Fe-ADH 


Iron-containing alcohol 
dehydrogenase 


9.9e-35 


124.5 


2 


52-253:276-
303


11 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.016 


-2.1 


1 


257-356 


12 


spectrin 


Spectrin repeat 


1.3e-10 


43.6 


3 


11-87:90-
197:200-291


13 


Ribosomal_L18ae


Ribosomal L18ae protein
family


1.9e-128


440.1 


1 


6-176 


14 


Ribosomal_L31e


Ribosomal protein L31e 


2.4e-47 


170.7 


1 


72-166 


15 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-
H type


7.8e-16 


66.0 


3 


342-367:371- 
396:398-420 


16 


zf-MYND 


MYND finger 


1.4e-13 


58.5 


1 


52-90 


17 


Sterile 


Male sterility protein 


1.1e-51


185.1 


1 


254-446 


18 


MgtE 


Divalent cation transporter 


8.6e-39 


142.3 


2 


138-274:352- 
499 


19 


Rap_GAP 


Rap/ran-GAP 


2e-124 


426.7 


1 


400-588 


19 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


2.4e-06 


34.5


1 


726-800 


20 


Rap_GAP 


Rap/ran-GAP 


2e-124 


426.7 


1 


400-588 


20 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


2.4e-06 


34.5


1 


726-800 


22 


SCAN 


SCAN domain 


1.3e-23


91.7 


1 


165-238 


23 


RhoGAP 


RhoGAP domain 


3e-58 


206.9 


1 


497-649 


23 


FCH 


Fes/CIP4 homology domain 


1.2e-18 


75.4 


1 


22-121 


23 


SH3 


SH3 domain 


2.6e-11


51.0 


1 


723-777 


24 


adh_zinc 


Zinc-binding dehydrogenases 


1 C A AC 


-25.4


i 
1 


Z0-33O 


25 


UDPGT 


UDP-glucoronosyl and UDP- 
glucosyl transferas 


1.6e-84 


294.3 


1 


26467 


28 


Ribosomal_L6e


Ribosomal protein L6e


4.3e-77


269.3


1 


109-239 


29 


Ribosomal_L11


Ribosomal protein L11


4.9e-64


226.2


1 


13-144 


30 


tRNA-synt_1e


tRNA synthetases class I (C)


1.6e-137


470.2


1 


64-538 


32 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.00041 


17.6 


2 


33-72:165- 
185 


34 


ras 


Ras family 


1.4e-77 


271.2


1 


35-235 


34 


arf 


ADP-ribosylation factor 
family 


9.3e-05 


-56.3 


1 


17-198 


36 


SET 


SET domain 


3.2e-05 


10.0 


1 


209-342 


36 


MORN 


MORN repeat 


0.006 


23.2


3


36-58:59-
81:106-128


37


laminin_G 


Laminin G domain 


1.5e-11


44.7 




55-174 


37 


EGF 


EGF-like domain 


0.0033 


24.1 


J 


202-234 


38 


Sema 


Sema domain 


1.7e-127


436.9


1


56-489


38 


Plexin_repeat 


Plexin repeat 


le-06 


35.7 




507-563 


38 




Immunoglobulin domain 


0.0023 


15.9 




582-639 


38 


integrin_B


Integrins, beta chain 


0.084 


6.1 




513-527 


40 


filament 


Intermediate filament protein 


1.6e-138 


473.6 


1


129-442 


41 


Keratin_B2


Keratin, high sulfur B2 
protein 


1.8e-18 


74.8 


2 


2-138:139- 
240 


44 


sushi 


Sushi domain (SCR repeat) 


3.8e-06 


33.9 


4 


1396- 

1459:1464- 

1521:1525- 

1590:1595- 

1646 


45 


profilin


Profilin


4.1e-13 


51.7 


1 


10-124 


47 


ubiquitin 


Ubiquitin family 


0.00033 


20.5 


1 


31-99 


48 


BTB 


BTB/POZ domain 


2.6e-21 


84.2


1 


80-196 


48 


Kelch 


Kelch motif 


2.6e-20 


80.9 


4 


336-382:384- 
430:432- 

A 'Id .COI /TIC 

478382-635 


48 


SCP 


SCP-like extracellular protein


0.015


13.0


1 


1-35 


49 


serpin 


Serpin (serine protease 
inhibitor) 


2.4e-178 


605.4 


1 


59-432 


50 


T-box 


T-box 


3.6e-125 


429.2 


1 


i /in iai 


52 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.2e-17 


58.3 


2 


1 32-228:3 J /- 


53 


CSD 


'Cold-shock' DNA-binding
domain 


1.8e-16 


63.6 


1 


42-112 


53 


zf-CCHC


Zinc knuckle 




OR R 
Zo.o 




1T7 1^4*1^0- 
ij/-ij < t. toy- 

176 


C A 

54 




Immunoglobulin domain 


/oe-u/ 


OR O 


1 
1 


oh inn 


55 


Rap_GAP


Rap/ran-GAP 


5e-18 


73.3 


1 


287-466 


57 


G-gamma 


GGL domain 


1 Co 1 1 


1Q A 


2 


/lO 7fV1flQ 


58 


T-box 


T-box


O (la 11/1 




1 


lUlOUZ 


59 


Gag_p10


Retroviral GAG p10 protein


9.2e-06


23.7


1


82-171 


61 


60s_ribosomal


60s Acidic ribosomal protein 


0.0089 


12.0 


1 


1-22 


62 


UPAR_LY6


u-PAR/Ly-6 domain


5.4e-05


22.3


1 


8-51 


63 


Ribosomal_L30


Ribosomal protein L30p/L7e


0.00042


18.5


1


65-93 


HA 

54 


filament 


Intermediate filament protein 


1 1/» 7R 


074 R 


o 


426 


65 


Ribosomal_S6


Ribosomal protein S6 


0.00082 


15 


1 


2-96 


66 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


5.1e-09 


43.4 


1 


158-250 


67 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.005 


14.0 


1 


92-118 


68 


G-patch 


G-patch domain 


6.8e-07 


36.3


1 


26-70 


69 


Keratin_B2


Keratin, high sulfur B2 
protein 


0.037 


-45.9 


1 


10-155 


83 


ig 


Immunoglobulin domain 


8.5e-09 


33.4 


2 


34-89:119- 
187 


86 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-71 


250.6 


17 


182-204:210-
232:237-
260:265- 
288:315- 
337:343- 
365:369- 
392:653- 
675:681- 
704:709- 
733:741- 
764:791- 
814:820- 
842:848- 
870:877- 
899:905- 
928:952-975 


87 


ig 


Immunoglobulin domain 


2.7e-35 


118.7 


6 


36-121:162- 

249:292- 

375:422- 

517:564- 

657:704-795 


88 


MAP1_LC3 


Microtubule associated 
protein 1A/1B, light 


9.4e-79 


275.0 


1 


118-221 


89 


WD40 


WD domain, G-beta repeat 


1.6e-12 


55.1 


4 


173-215:221- 
263:269- 
305:1103- 
1140 


90 


FKBP 


FKBP-type peptidyl-prolyl
cis-trans isomeras 


l^e-59 


198.9 


1 


66-160 


92 


RPEL 


RPEL repeat 


6.5e-18 


73.0 


2 


513-538:551- 
576 


93 


transket_pyr


Transketolase, pyridine 
binding domain 


4.6e-65 


229.6 


1 


568-773 


93 


E1_dehydrog


Dehydrogenase E1
component 


8.7e-23 


89.1 


1 


193-504 


95 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


8.7e-09 


32.7


1 


595-635 


97 


ig


Immunoglobulin domain 


1.8e-20 


71.0 


3 


31-88:127- 
185:222-278 


98 


ig 


Immunoglobulin domain 


1.8e-20 


71.0 


3 


24-81:120- 
178:215-271 


99 


Patched 


Patched family 


6.2e-06 


-369.1 


1 


66-935 


102 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-94 


326.9 


12 


209-231:237- 

259:265- 

287:293- 

315:321- 

343:349- 

371:377- 

399:405- 

427:433- 

455:461- 

483:489- 

511:594-616


102


KRAB 


KRAB box 


3.7e-37 


136.9 


1 


15-77 


103 


zf-C2H2 


Zinc finger, C2H2 type 


l^e-55 


198.2 


9 


172-195:271- 

293:299- 

321:327- 

349:355- 

377:383- 

405:411- 

433:439- 

461:467-489 


103 


KRAB 


KRAB box 


3e-46 


167.1 


1 


8-70 


107 


zf-CCHC 


Zinc knuckle 


2.4e-16


67.8 


3 


913- 

930:1293- 

1310:1358- 

1375 


107 


NTP_transf_2


Nucleotidyltransferase 
domain 


4.4e-11


50.3 


1 


972-1065 


108 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-42 


154.7 


5 


283:289- 
311:317- 
339:345- 
367:373-395 


109 


myosin_head


Myosin head (motor domain) 


0 


1267.5 


1 


26-697 


109 


IQ 


IQ calmodulin-binding motif 


1.2e-17 


72.1 


4 


714-734:737- 

757:760- 

780:789-809 


110 


pkinase 


Protein kinase domain 


l^e-96 


334.5


1 


20-271 


111 


WD40 


WD domain, G-beta repeat 


1.8e-49 


177.8 


8 


161-197:218- 

253:258- 

294:300- 

335:341- 

377:383- 

428:434- 

470:476-511 


112 


SNF2_N 


SNF2 and others N-terminal
domain 


4.2e-78 


272.9 


1 


1-264 


112 


helicase_C 


Helicase conserved C-
terminal domain 


1.2e-24 


95.4 


1 


326-410 


113 


DUF15 


Domain of unknown function 
DUF15 


0.00064 


-60.4 


1 


132-384 


114 


DSPc 


Dual specificity phosphatase, 
catalytic 


0.0004 


-2.9 


1 


141-295 


114 


Y_phosphatase 


Protein-tyrosine phosphatase 


0.0037 


-26.9 




128-295 


115 


Ulp1_C


Ulp1 protease family, C-
terminal catalytic d 


2.8e-52 


187.1 




394-587 


117 


Rhodanese 


Rhodanese-like domain 


le-05 


32.4 




160-260 


119 


ABC1 


ABC1 family 


1.7e-40 


147.9 




318-434 


122 


proteasome 


Proteasome A-type and B- 
type 


7.4e-43 


155.8 




39-146 


124 


Ribosomal_L9 


Ribosomal protein L9 


3.1e-05 


-3.4 




94-240 


125 


RIO1


RIO1/ZK632.3/MJ0444 
family 


7.8e-80 


278.6 




193-387 


128 


abhydrolase 


alpha/beta hydrolase fold 


4.5e-20


80.1 


1 


121-364 


129 


TPR 


TPR Domain 


4.8e-27 


103.3 


7 


355-388:473-
506:507-
540:654-
687:688-
721:722-
755:756-789


130


HMG14_17


HMG14 and HMG17 


1.9e-15 


64.7 


1 


2-73 


131 


bZIP 


bZIP transcription


8.3e-19 


71.7 


1 


288-352 


132 


rrm


RNA recognition motif. 


1.9e-31 


117.9 


3 


432-502:546- 














616:858-929 


133 


AMP-binding 


AMP-binding enzyme 


7.1e-117 


401.7 


1 


142-580 


138 


tubulin 


Tubulin/FtsZ family 


2.1e-151 


516.4 


1 


1-223 


141 


laminin_EGF


Laminin EGF-like (Domains 


7.6e-12 


52.8 


4 


252-297:300- 






III and V) 








348:1342- 














1391:1469- 














1530 


141 


Kelch 


Kelch motif 


1.6e-05 


31.8 


4 


654-702:760- 














811:873-














918:929-990 


141 


integrin_B


Integrins, beta chain 


0.0061 


9.4 


3 


44-59:100- 














117:1019- 














1028 


141 


EGF 


EGF-like domain 


0.092 


19.3 


8 


167-203:207- 














235:297- 














331:496- 














533:538- 














569:1271- 














1308:1312- 














1338:1478- 














1508 


142 


RUN


RUN domain 


8e-44 


159.0 


1 


31-163 


142 


FYVE 


FYVE zinc finger 


2.3e-29 


109.1


1 


529-593 


143 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-33 


124.7 


5 


442-464:505- 














527:533- 














555:561- 














583:589-611 


143 


BTB 


BTB/POZ domain 


1.6e-22 


88.2 


1 


30-143 


144 


mito_carr 


Mitochondrial carrier protein 


3.6e-61 


216.6 


3 


10-158:160- 














250:254-354 


146 


DAGKc 


Diacylglycerol kinase 


0.00015 


26.0 


1 


157-303 






catalytic domain










147 


Exonuclease 


Exonuclease 


1.6e-41


151.4 


1 


228-384 


147 


rrm


RNA recognition motif. 


9.5e-08 


39.2 


2 


507-574:602- 














674 


151 


WH2 


WH2 motif 


6.5e-20 


79.6 


3 


1194- 














1214:1234- 














1254:1322- 














1342 


154 


DHDPS 


Dihydrodipicolinate 


9.1e-21 


82.4 


1 


3-270 






synthetase family 










156 


PseudoU_synth_l 


tRNA pseudouridine synthase 


le-30 


115.4 


1 


111-322 


157 


pkinase 


Protein kinase domain 


2.3e-59 


210.6 


1 


216-512 


158 


ubiquitin 


Ubiquitin family 


2.4e-05


24.6 


1 


3-79


160


IF-2B 


Initiation factor 2 subunit 
family 


1.7e-98 


340.7 


1 


157-475 


161 


Beach 


Beige/BEACH domain 


1.1e-224


759.8 


1 


1470-1747 


161 


WD40 


WD domain, G-beta repeat 


2.9e-08 


40.9 


5 


1848- 

1882:1888- 
1928:1947- 
1983:2030- 
2064:2071- 
2107 


164 


DnaJ 


DnaJ domain 


1.9e-16 


68.1 


1 


125-189 


165 


Anti_proliferat


BTG1 family 


7.4e-85 


295.3 


1 


1M64 


166 


sugar_tr


Sugar (and other) transporter 


1.2e-78 


274.7 


1 


34-548 


167 


sugar_tr


Sugar (and other) transporter 


7e-52 


185.8 


1 


34-180 


168 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-93 


324.0 


13 


222-244:250- 

272:278- 

300:306- 

328:334- 

356:362- 

384:390- 

412:418- 

440:446- 

468:474- 

496:502- 

524:530- 

552:558-580 


168 


KRAB 


KRAB box 


1.8e-35 


131.2 


1 


57-119 


169 


GBP 


Guanylate-binding protein, 
N-terminal domain 


le-191 


636.2 


1 


1-275 


169 


GBP_C 


Guanylate-binding protein, C-
terminal domain 


6.6e-162 


551.3 


1 


277-573 


170 


cyclin 


Cyclin, N-terminal domain 


0.0022 


9.3 


1 


48-192 


171 


TPR 


TPR Domain 


9.7e-43 


155.4 


6 


133-166:167- 

200:201- 

234:282- 

315:316- 

349:350-383 


173 


RhoGEF 


RhoGEF domain 


3.3e-40 


147.0


1 


100-345 


173 


PH 


PH domain 


6.5e-14 


54.5


1 


Q*70 AQ1 


173 


SH3 


SH3 domain 


1.1e-10


48.9 


1 


72-126 


174 


zf-C3HC4


Zinc finger, C3HC4 type
(RING finger) 


U.UUUl I 


10 A 




lO'JJ 


174 


GBP_C 


Guanylate-binding protein, C- 
terminal domain


0.016 


12.1 


1 


86-114 


175 


Peptidase_M22


Glycoprotease family 


2.3e-73 


257.2 


1 


1-324 


177 


TBC 


TBC domain 


4.7e-08 


10.1 


1 


57-268 


178 


transmembrane4


Tetraspanin family 


1.6e-78 


259.2 


1 


16-261 


179 


CH 


Calponin homology (CH) 
domain 


1.2e-25 


98.6 


1 


24-133 


179 


calponin 


Calponin family repeat 


1.7e-14 


51.8 


1 


174-199 


182 


AP_endonucleas1


AP endonuclease family 1 


2.6e-17 


59.4 


2 


1-36:50-135 


184 


Bacterial_PQQ


PQQ enzyme repeat 


9.3e-05 


29.2 


2 


52-89:534- 
571


185


DEAD 


DEAD/DEAH box helicase 


1.6e-60


194.3


1


216-420


185 


helicase_C 


Helicase conserved C- 
terminal domain


5.9e-25 


96.3 


1 


454-540 


186 


zf-C2H2 


Zinc finger, C2H2 type 


3.2e-24 


93.9 


6 


106-128:134- 

156:162- 

184:195- 

218:477- 

499:505-529 


187 


sugar_tr 


Sugar (and other) transporter 


0.0014 


-90.1 


1 


272-672 


188 


tRNA_int_endo


tRNA intron endonuclease, 
catalytic C-t 


0.0025 


-7.7 


1 


73-159 


189 


wsc 


WSC domain 


le-35 


132.1 


1 


175-254 


189 


Sulfotransfer


Sulfotransferase protein 


4e-34 


126.8 


1 


356-586 


191 


pkinase 


Protein kinase domain 


5.1e-75 


262.6 


1 


148-421 


191 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


1.3e-05 


32.1 


1 


740-827 


193 


globin 


Globin 


1.9e-26 


96.6 


1 


3-78 


195 


WD40 


WD domain, G-beta repeat 


6.7e-14 


59.6 


4 


64-108:116- 

153:158- 

194:288-323 


197 


BRO1


BRO1-like domain


0.0042 


-29.4 


1 


9-161 


198 


F_actin_cap_B


F-actin capping protein, beta 
subunit 


1.7e-224 


759.2 


1 


1-269 


199 


ank 


Ank repeat 


le-66 


235.0 


8 


40-73:82- 

114:115- 

147:148- 

180:181- 

212:213- 

246:481- 

526:527-559 


203 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


4.2e-07 


37.0 


1 


211-293 


204 


SAM 


SAM domain (Sterile alpha 
motif) 


1.2e-11


52.1 


1 


5-70 


205 


SAM 


SAM domain (Sterile alpha 
motif) 


1.2e-11


52.1 


1 


5-70 


206 


zf-UBR1


Putative zinc finger in N- 
recognin 


4.7e-25 


96.7 


1 


978-1046 


207 


ABC_tran


ABC transporter 


2.4e-112 


386.6 


2 


467- 

647:1536- 
1717 


209 


zf-C2H2 


Zinc finger, C2H2 type 


0.00035 


27.3 


1 


200-225 


210 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.5e-19 


78.4 


1 


385-154 


211 


IMP4 


Domain of unknown function 


2.2e-33 


124.3 


1 


144-297 


213 


zf-C2H2 


Zinc finger, C2H2 type 


2.9e-08 


40.9 


3 


12-37:173- 
198:208-230 


214 


LysM 


LysM domain 


2.1e-11


51.3 


1 


73-116 


215 


ank 


Ank repeat 


1.1e-05


32.3 


2 


834-867:879- 
912 


215 


TIG 


IPT/TIG domain 


0.009 


22.6 


1 


642-723 


217 


pyr_redox 


Pyridine nucleotide-
disulphide oxidoreducta


1.7e-71


251.0


1


196-470










217 


Rieske 


Rieske [2Fe-2S] domain 


6.2e-20 


79.6 


1 


68-168 


218 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


8.5e-19 


75.9 


1 


642-728 


219 


pkinase 


Protein kinase domain 


8.1e-67 


235.4 


1 


26-204 


220 


dsrm 


Double-stranded RNA 
binding motif 


0.095 


15 


1 


100-172 


221 


PHD 


PHD-finger 


5.4e-05


29.6 


1 


147-203 


222 


L27 


L27 domain 


6.5e-16 


66.3 


1 


13-68 


222 


SAM 


SAM domain (Sterile alpha 
motif)


7.2e-10 


46.2 


2 


1051- 

1117:1166- 
1230 


223 


TRM 


N2,N2-dimethylguanosine
tRNA methyltransfera


7.3e-22


86.1 


1 


227-693 


224 


LIM 


LIM domain 


5.3e-06 


33.4 


2 


124-180:183- 
243 


225 




Immunoglobulin domain 


1.1e-07


29.8 


1 


55-144 


227 


F-box 


F-box domain 


1.3e-05 


32.1 


1 


11-59 


229 


Glucosamine_iso 


Glucosamine-6-phosphate 
isomerases/6- 


2.7e-158 


539.3 


1 


15-250 


231 


PTNMK 


PTN/MK heparin-binding 
protein family 


3.6e-44


160.2 


1 


51-148 


236 


ion_trans 


Ion transport protein 


1.6e-22


88.3 


1 


174-393 


238 


GNS1_SUR4


GNS1/SUR4 family


5.2e-46


166.3


1 


10-265 


240 


ubiquitin 


Ubiquitin family 


2.7e-05 


24.4 


1 


10-89 


241 


PIP5K 


Phosphatidylinositol-4- 
phosphate 5-Kinase 


1.5e-155 


530.2 


1 


124-420 


242 


cadherin


Cadherin domain 


0 


1298.9 


19 


1-75:89- 

180:194- 

290:355- 

434:448- 

549:563- 

652:671- 

774:788- 

881:896- 

988:1002- 

1092:1106- 

1192:1206- 

1295:1309- 

1379:1393- 

1489:1503- 

1594:1608- 

1699:1713- 

1808:1814- 

1910:1922- 

2016 


244 


fn3 


Fibronectin type III domain


1.2e-31 


118.6 


4 


58-140:152- 

238:249- 

333:345-426 


245 


UQ_con 


Ubiquitin-conjugating
enzyme 


1.4e-16 


68.5 


1 


93-250 


246 


LRR 


Leucine Rich Repeat 


1.7e-14 


61.6 


6 


51-75:76- 



234 



WO 02/081731 



PCT/US02/01222 



Table 4 



m 




Description


E-value


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 












- 


99:155- 
178:181- 
203:204- 
226:227-251 


247 


lipocalin


Lipocalin / cytosolic fatty- 
acid binding Dr 


1.2e-28 


102.8 


1 


164-294 


248 


Ribosomal_S2


Ribosomal protein S2


2.9e-11


43.7 


1 


33-80 




tubulin


Tubulin/FtsZ family


8.3e-163


554.2 


1 


1-277 




tubulin


Tubulin/FtsZ family


2.4e-212 


718.8 


1 


1-351 


251 


ATP-synt_ab 


ATP synthase alpha/beta
family, nucleot


l^e-75 


264.8 


1 


138-346 


251 


ATP-synt_ab_C 


ATP synthase alpha/beta
chain, C termin


2.7e-38 


140.6 


1 


348-456 


251 


ATP-synt_ab_N


ATP synthase alpha/beta
family, beta-ba


5.4e-19


76.5


1 


67-135 


252


ATP-synt_ab


ATP synthase alpha/beta
family, nucleot


1.3e-70


248.0 


1 


138-344 


252


ATP-synt_ab_N


ATP synthase alpha/beta
family, beta-ba




76.5 




67-135 


ZD J 


zf-C3HC4


(RING finger) 


5e-12 


43.2 


1 


39-79 




G-patch


G-patch domain


1.3e-08 


42.1 


1 


410-456 


ZO J 




Calponin homology (CH)

domain 


1.6e-11


51.7 


1 


24-134 


256 


RF-1 


Peptidyl-tRNA hydrolase 
domain 


5.9e-66 


232.5


1 


225-338 


257


RF-1


Peptidyl-tRNA hydrolase

domain 


5.9e-66 


232.5 


1 


189-302 


258


OTU


OTU-like cysteine protease


4.4e-18


73.3


1


189-304


259


thiored


Thioredoxin


2e-09 


35.7 


2 


119-165:662- 
695 


260 


thyroglobulin_1


Thyroglobulin type-1 repeat 


3.1e-34 


127.2 


2 


95-158:227- 
292 




kazal


Kazal-type serine protease
inhibitor 


9.3e-07 


35.9 


1 


43-87 


262 


DnaJ 


DnaJ domain 


4.1e-15 


63.6 


1 


277-338 


ZOj 




WD domain, G-beta repeat


4e-21 


83.6 


5 


3-42:49- 
86:97- 
133:142- 
178:184-220 


265 


DUF6 


Integral membrane protein 
DUF6 


0.083 


9.1 


2 


81-316:338- 
470 


266 


RibosomaLL31e 


Ribosomal protein L31e 


1.7e-61 


217.7 


1 


15-109 


268 


F5_F8_type_C


F5/8 type C domain


2.4e-65


230.3


1 


42-196 


268 


Zn_carbOpept 


Zinc carboxypeptidase 


3.3e-50


180.1 


2 


224-341:400- 
600 


270 


BTB 


BTB/POZ domain 


7.7e-18 


72.7 


1 


8-119 


270 


zf-C2H2 


Zinc finger, C2H2 type 


4.2e-13 


57.0 


4 


254-276:363- 

385:390- 

412:448-468 


271 


Glycos_transf_l 


Glycosyl transferases group 1 


0.027 


12.8 


1 


291-385 


272 


HEAT 


HEAT repeat 


2.2e-07 


38.0 


3 


237-275:276-
315:674-712


273 


HEAT 


HEAT repeat 


2.2e-07 


38.0 


3 


237-275:276- 
315:640-678


275 


SPRY 


SPRY domain 


2.6e-34 


127.4 


1 


390-515 


275 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


le-16 


58.5 


1 


29-69 


277 


BTB 


BTB/POZ domain 


6e-27 


103.0 


1 


36-149 


277 


Kelch


Kelch motif 


9.7e-21 


82.3 


4 


331-390:392- 

441:443- 

493:540-586 




zf-C2H2


Zinc finger, C2H2 type


4.1e-116 


399.2 


14 


193-215:221- 

243:249- 

271:277- 

299:305- 

327:333- 

355:361- 

383:389- 

411:417- 

439:445- 

467:473- 

495:501- 

523:529- 

551:557-579 


229 


SCAN 


SCAN domain 


2.4e-52 


187.3 


1 


36-132 


229 


zf-C2H2 


Zinc finger, C2H2 type 


2.4e-51 


184.0 


7 


348-370:375- 

397:403- 

425:431- 

453:459- 

480:486- 

508:514-537 


Oil 


Zip


ZIP Zinc transporter


6.6e-20 


79.6 


1 


1-146 






Nucleotidyltransferase
domain 


8.5e-13 


55.9 


1 


67-174 


286 


zf-C2H2 


Zinc finger, C2H2 type


2.8e-93 


323.3 


12 


118-140:146- 

168:174- 

196:202- 

224:230- 

252:258- 

280:286- 

308:314- 

336:342- 

364:370- 

392:398- 

420:426-448 


286 


KRAB 


KRAB box 


3.6e-38 


140.2 


1 


8-70 


287 


zf-C2H2 


Zinc finger, C2H2 type 


5.3e-124 


425.4 


17 


183-205:211- 

233:239- 

261:267- 

289:295- 

317:323- 

345:351- 

373:379-
401:407-
429:435-
457:463-
485:491-
513:519-
541:547-
569:575- 
597:603- 
625:631-653 


289 


DiHfolate_red 


Dihydrofolate reductase 


7.4e-77 


268.8 


1 


4-185


291 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


7.4e-17 


69.4 


1 


5-84 


293 


PH 


PH domain 


1.4e-08 


35.5


1 


44-147 


294 


adh_short


short chain dehydrogenase


3.9e-29


110.2


1


36-284 


297 


PKD 


PKD domain 


9.9e-09 


42.4 


2 


663-753:756- 
839 


297 


BNR 


BNR repeat 


3.2e-06 


34.1 


5 


115-126:156- 
167:351- 
362:428- 
439:470-481 


300 


HMG_box 


HMG (high mobility group) 
box 


5.4e-05 


20.0 


1 


245-304 


301 


ig 


Immunoglobulin domain 


0.05 


11.6 


1 


629-688 


302 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


5e-12 


43.2 


1 


39-79 


303 


START 


START domain 


0.015 


4.1 


1 


1790-1994 


304 


integrase


Integrase DNA binding 
domain 


7.2e-06 


32.9 


1 


51-96 


305 


myosin_head


Myosin head (motor domain) 


7.6e-279 


939.7 


2 


11-668:689- 
733 


306 


zf-C2H2 


Zinc finger, C2H2 type 


8.5e-54


192.1 


7 


66-88:94- 
116:122- 
144:150- 
172:178- 
200:280- 
303:317-339 


307 


ig 


Immunoglobulin domain 


0.00023 


19.1 


2 


35-104:136- 
194 


309 


ras 


Ras family 


0.00079 


-93.3


1 


38-176 


310 


ig


Immunoglobulin domain 


2.1e-06 


25.7 


1 


37-112 


311 


EF1BD 


EF-1 guanine nucleotide 
exchange domain 


4.7e-56 


199.6 


1 


139-225 


312 


BTB 


BTB/POZ domain 


8.4e-25 


95.8 


1 


51-164 


313 


zf-C2H2 


Zinc finger, C2H2 type 


7.7e-59 


208.9 


9 


118-140:197- 

219:281- 

303:309- 

331:337- 

359:365- 

387:393- 

415:421- 

443:449-471 


313 


KRAB 


KRAB box


1.4e-17 


71.8 


1 


41-99 
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314 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


0.045 


8.5


1 


213-671 


315 


cNMP_binding


Cyclic nucleotide-binding 
domain 


4e-26 


100.5


1 


387-475 


315 


ion_trans


Ion transport protein 


3.8e-19 


77.0 


1 


69-290 


316 


Peptidase_S26 


Signal peptidase I 


2.8e-16 


56.3


2 


38-98:117- 
139 


317 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-56 


199.8 


9 


156-178:184- 

206:212-

234:240-

262:268-

290:296- 

318:324- 

346:352- 

374:378-400 


317 


KRAB 


KRAB box 


6.7e-16 


66.3 


1 


11-73 


319 


UPF0073 


Uncharacterised protein 
family 


1.8e-09 


27.9 


1 


33-276 


320 


EGF 


EGF-like domain 


4.7e-08 


40.2 


1 


26-59 


321 


lectin_c 


Lectin C-type domain 


8.6e-15 


62.6 


1 


268-374 


325 


MAM 


MAM domain 


1.3e-52 


188.2 


1 


338-503 


325 


ig 


Immunoglobulin domain 


1.9e-15 


54.8 


3 


41-101:138- 
202:346-420 


327 


MAM 


MAM domain 


5.3e-180


611.4 


4 


26-169:170- 

329:342- 

498:509-666 


328 


Sema 


Sema domain 


Ue-211 


716.2 


1 


56-491 


329 


zf-C2H2


Zinc finger, C2H2 type


1.5e-84


294.3


13 


170-192:198- 

220:226- 

248:254-

276:282-

304:310- 

332:338-

360:366- 

388:394- 

416:422- 

444:450- 

472:478- 

500:506-528 


331 


PAP2 


PAP2 superfamily 


8e-22 


85.9 


1 


160-314 


332 


LRR 


Leucine Rich Repeat 


3.4e-36 


133.7 


11 


58-81:82- 

105:106- 

129:130- 

153:154- 

177:178- 

201:202-

225:250-

273:274-

297:298-

321:322-345 


332 


ig


Immunoglobulin domain 


2.5e-08 


31.9 


1 


425-485 


332 


LRRNT 


Leucine rich repeat N- 


2.5e-05 


31.1 


1 


27-56 
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terminal domain 










332 


LRRCT 


Leucine rich repeat C- 
terminal domain 


0.0029 


24.3 


1 


355-408 


333 


AdoHcyase 


S-adenosyl-L-homocysteine
hydrolase 


1.5e-280 


945.4 


1 


214-640 


334 


TBC 


TBC domain 


9.4e-38 


138.9 


\ 


89-302 


341 


WD40 


WD domain, G-beta repeat 


0.00094 


25.9 




2-32:109-146 


342 


ABC1 


ABC1 family 


0.051 


-29.9 


1


3-50 


344 


globin 


Globin 


3e-45 


162.2 


1


M41 


345 


globin 


Globin 


7.5e-39 


139.9 




1-31:68-179 


347 


F-box 


F-box domain 


1.5e-07 


38.5 


1


24-72 


348 


HLH 


Helix-loop-helix DNA- 
binding domain 


2e-08 


41.4 


1 


83-137 


349 


KRAB 


KRAB box 


2.7e-39 


144.0 




4-66 


350 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.7e-19 


78.2 


1


645-705 


350 


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil 


9.1e-15 


62.5 




363-394 


350 


zf-UBP 


Zn-finger in ubiquitin- 
hydrolases and other 


0.00069 


18.9 


1


236-306 


351 


NUDIX 


MutT-like domain 


8.2e-12 


52.7 




50-200 


352 


IBR 


IBR domain 


1.6e-12 


55.0 


1


101-166 


353 


IBR


IBR domain 


1.6e-12 


55.0 




66-131 


354 


SCP 


SCP-like extracellular protein 


1.4e-34 


128.3 


1 


56-208 


356 


mito_carr 


Mitochondrial carrier protein 


9.7e-78 


271.7 




10-125:127- 
220:232-321 


358 


UCH-1


Ubiquitin carboxyl-terminal 
hydrolases famil 


5.1e-15 


63.3 




323-354 


358 


zf-UBP 


Zn-finger in ubiquitin-
hydrolases and other


0.00049 


19.4 




195-264 


360 


Phage_lysozyme


Phage lysozyme 


0.0014 


23.4




94-184 


362 


Ribosomal_S2


Ribosomal protein S2 


3.3e-08 


32.9 




20-62 


364 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


5.3e-09 


33.4 




291-329 


365 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.0096 


13.1 




109-148 




TPR 



TPR Domain 



0.043 


20.4 




1-28 


370 


zf-C2H2 


Zinc finger, C2H2 type


5.3e-109


375.3


14 


127-149:155- 

177:183- 

205:211- 

233:239- 

261:267- 

289:295- 

317:323- 

345:351-

373:379- 

401:407- 

429:435- 

457:463- 

485:491-513 


370 


SCAN 


SCAN domain 


4.2e-38 


140.0 


1 


27-122 


371 


arf 


ADP-ribosylation factor 


4.9e-39 


143.1 


1 


6-184 
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family 










371 


ras 


Ras family 


7.2e-06 


-70.1 


1 


22-186 


372 


BNR 


BNR repeat 


0.031 


20.9 


3 


171-182:244- 
255:295-306 


373 


zf-C2H2 


Zinc finger, C2H2 type 


8.3e-25


95.8 


5 


142-162:171- 
198:204- 
228:234- 
258:264-288 


376 


rrm


RNA recognition motif. 


0.00019 


28.2 


1 


112-163 


377 


rrm


RNA recognition motif. 


2.2e-19 


77.9 


1 


112-183 


380 


vwc 


von Willebrand factor type C 
domain 


1.6e-31 


118.2 


3 


22-76:79-
134:137-192 


381 


Ribosomal_L35Ae


Ribosomal protein L35Ae


0.00013 


7.0 


1 


1-79 


385 


ras 


Ras family 


3.9e-63 


223.2 


1 


35-229 


385 


arf 


ADP-ribosylation factor 
family 


1.7e-05


-46.9 


1 


18-202 


388 


F-box 


F-box domain 


1.5e-05


31.9 


2 


23-70:99-146 


390 


SPRY 


SPRY domain 


6.2e-10 


46.4 


1 


101-239 


391 


tRNA_Me_trans 


tRNA methyl transferase 


1.9e-19


50.9 


1 


5-185 


392 


zf-C2H2 


Zinc finger, C2H2 type 


4e-17 


70.3 


3 


175-197:203- 
225:231-253 


393 


SCAN 


SCAN domain 


3.1e-39 


143.8 


1 


389-484 


393 


SPRY 


SPRY domain 


1.8e-19 


78.1 


1 


148-273 


393 


zf-C2H2 


Zinc finger, C2H2 type 


4e-09 


43.7 


2 


759-781:787- 
809 


393 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.0032 


14.7 


1 


11-52 


394 


Kelch 


Kelch motif 


4e-53 


189.9 


5 


329-375:377- 
431:433- 
479:481- 
525:527-572 


394 


BTB 


BTB/POZ domain 


6.1e-26 


99.6 


1 


30-144 


395 


C2 


C2 domain 


2.2e-80 


280.4 


2 


159-251:296- 
384 


396 


ank 


Ank repeat 


5.6e-33 


123.0 


4 


47-79:80- 

112:140- 

174:175-207 


396 


PH 


PH domain 


8.9e-05 


22.0 


1 


236-334 


397 


ank 


Ank repeat 


1.7e-26 


101.4 


4 


17-49:50- 

82:83- 

115:116-148 


398 


Nucleoplasmin


Nucleoplasmin


3.6e-29 


110.4 


1 


13-209 


400 


DAGKa 


Diacylglycerol kinase 
accessory domain 


1.5e-124


426.8 


1 


598-778 


400 


DAGKc 


Diacylglycerol kinase 
catalytic domain 


7.1e-67 


235.6 


1 


454-578 


400 


DAG_PE-bind


Phorbol esters/diacylglycerol 
binding dom 


2.9e-23 


90.7 


2 


261-310:326- 
374


400 


efhand 


EFhand 


2.4e-12 


54.4 


2 


169-197:214- 
242 


403 


PDZ 


PDZ domain (Also known as 


7.7e-46 


165.7 


3 


86-166:210- 
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table 4 



SEQ 
ID 


Pfam Model


Description


E-value


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 






DHR or GLGF) 








291:821-907 


404 


zf-C2H2 


Zinc finger, C2H2 type 


2.6e-48 


173.9 


7 


172-194:200- 

222:228- 

250:256- 

278:284- 

306:312- 

331:340-362 


405 


K_tetra


K+ channel tetramerisation 
domain 


2.6e-23 


90.9 


1 


51-146 


406 


SNF 


Sodium:neurotransmitter
symporter family


0 


1268.7 


1 


60-657 


407 




Immunoglobulin domain 


1.1e-06


26.5 


1 


53-120 


408 


DnaJ 


DnaJ domain 


2.3e-27


104.3 


1 


4-68 


408 



DnaJ_C


DnaJ C terminal region 


3.1e-08 


38.1 


1 


192-314 


409 


mito_carr 


Mitochondrial carrier protein 


1.4e-57 


204.7 


3 


5-100:102- 
201:205-302 


410 




Zinc finger, C2H2 type 


5.2e-97


335.7 


12 


141-163:169- 

191:197- 

219:225- 

247:253- 

275:281- 

303:309- 

331:337- 

359:365- 

387:393- 

415:421- 

443:449-473 


411 


S_100


S-100/ICaBP type calcium 
binding domain 


9.7e-13 


55.8 


1 


5-48 


411 


efhand 


EFhand 


0.0012 


25.6 


1 


54-82 


413 


fn3


Fibronectin type III domain


8.6e-14 


59.3 


2 


22-107:119- 
196 


413 


PHD 


PHD-finger 


9.6e-05 


27.2


1 


285-341 


414 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-27 


104.4 


6 


42-64:117- 

139:145- 

167:173- 

196:534- 

556:573-595 


415 


SPRY 


SPRY domain 


3.9e-18 


73.7 


1 


347-467 


415 


zf-C3HC4


Zinc finger, C3HC4 type 
(RING finger) 


4.4e-14 


49.9 


1 


16-56 


415 


zf-B_box 


B-box zinc finger 


9e-07 


35.9 


1 


92-133 


416 


pkinase 


Protein kinase domain 


1.2e-54 


195.0 


1 


97-317 


417 


trypsin 


Trypsin 


4.6e-38 


122.3


1 


41-234 


418 


Glypican


Glypican


5.7e-131 


448.5 


1 


3-244 


419 


Keratin_B2


Keratin, high sulfur B2 
protein 


0.0013 


-23.4 


1 


37-159 


420 


Dynein_heavy


Dynein heavy chain 


0 


1432.3 


1 


309-1019 


421 


zf-C2H2 


Zinc finger, C2H2 type 


0.00039 


27.2 


3 


75-99:203- 
227:266-290 


422 




Immunoglobulin domain 


0.00074 


ns 


1 


34-107 


423 


fn3


Fibronectin type III domain


6e-08 


39.8 


1 


443-531 
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Table 4 





424 


Keratin_B2


Keratin, high sulfur B2 
protein 


0.0023 


-27.1 


2 


5-150:152- 
251 


425 


pkinase 


Protein kinase domain 


2.3e-55 


197.3 


1 


69-390 


426 




Immunoglobulin domain 


4.1e-09 


34.4 


1 


35-112 


427 


Galactosyl_T


Galactosyl transferase 


2.6e-35 


130.8 


1 


158-349 


428 


proteasome 


Proteasome A-type and B- 
type 


5.3e-28


106.4 


1 


96-238 


429 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


3.4e-38 


123.3


1 


41-290 




D ID 




O. 1C*AJ 


89.2 


I 


58-173 


430 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-07 


37.0 


2 


472-494:500- 

522


All 


p450


Cytochrome P450






i 




434 


sugar_tr


Sugar (and other) transporter


2.6e-64 


227.1 


1


10-512


435 


zf-C2H2 


Zinc finger, C2H2 type 


1.8e-52 


187.8 


9 


287-309:315- 

337:546- 

568:574- 

596:606- 

628:844- 

866:872- 

894:980- 

1002:1008- 

1030 


436 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


2.2e-40 


130.4 


2 


82-221:229- 
284 


437 


FGF 


Fibroblast growth factor 


4.6e-14 


51.6 


1 


48-129 


438 


Osteopontin 


Osteopontin 


3.7e-181 


615.2 


1 


1-294 
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PDB annotation 


TRANSCRIPTION 
REGULATION PROTO- 
ONCOGENE, NUCLEAR 
BODIES (PODS), LEUKEMIA, 
2 TRANSCRIPTION 
REGULATION A 


TRANSFERASE HRS; HRS,
VHS, FYVE, ZINC FINGER, 
SUPERHELIX 


METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MAT1; RING FINGER
(C3HC4) 




COMPLEX 

(ISOMERASE/DIPEPTIDE) 
PIN1; PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE, 
ROTAMASE, 2 COMPLEX 
(ISOMERASE/DIPEPTIDE) 
CONECT 




COMPLEX
(APOPTOSIS/PEPTIDE)
APOPTOSIS, ALTERNATIVE
SPLICING, COMPLEX
(APOPTOSIS/PEPTIDE)


APOPTOSIS HELICAL
PROTEIN


APOPTOSIS APOPTOSIS
REGULATOR BCL-X;
APOPTOSIS, PROGRAMMED
CELL DEATH, BCL-2
FAMILY




STRUCTURAL PROTEIN


Compound 


TRANSCRIPTION FACTOR 
PML; CHAIN: NULL; 


HEPATOCYTE GROWTH 
FACTOR-REGULATED 
TYROSINE CHAIN: A: 


CDK-ACTIVATING 
KINASE ASSEMBLY 
FACTOR MAT1; CHAIN: A;




Q w jj 


if 
II 

-99 
CP 

Ie 

Jo 




BCL-XL; CHAIN: A; BAK


Q 

a 


APOPTOSIS REGULATOR 
BAX, MEMBRANE 
ISOFORM ALPHA; CHAIN: 
A; 


BCL-XL; CHAIN: NULL; 




ALPHA SPECTRIN; CHAIN: 


SEQFOLD 
score 






















81.85 


PMF 
score 


o 
© 


0.76 


0.62 




0.23 




100 


0.10 


0.03 






Verify 
score 


-0.58 


-0.23 


-0.44 




9 




s 

9' 


oo 

9 


-0.03 






Psi 
Blast 


8 

Z 


0.0061 


1.3e-06 




NO 

o 

<£> 
oo 




OO 

cn 

<6 

«r> 


NO 


I 

cn 




i— t 

CN 




s 




8 




NO 
OO 




CN 

NO 

m 


8 

cn 


cn j 

NO 

cn 




CN 


START 
AA ■ 


o 




o 

f-H 




OO 




a 

cn 


oo 
o 

CN 






ON 


CHAIN 
ID 




< 


< 




< 




< 


< 






< 


M 


1bor


1dvp


oo 
«— ■» 




1pin




1bxl


NO 


1maz




| 


SEQID 
NO: 






«— « 








< 




i— « 
«— « 




CN 
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! 



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332 fit £& §! 
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So: 



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§ 



WO 02/081731 



PCT/US02/01222 




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PDB annotation 


BINDING/EFFECTOR), G 
PROTEIN, EFFECTOR. 
RABCDR, 2 SYNAPTIC 
EXOCYTOSIS. RAB 
PROTEIN, RAB3 A, 

PARPWTT TM 


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,

NEUROTRANSMITTER 
RELEASE, HYDROLASE 


TRANSCRIPTION 
REGULATION SIGMA70; 
RNA POLYMERASE SIGMA 
FACTOR, TRANSCRIPTION 
REGULATION 


COMPLEX (BLOOD
COAGULATION/INHIBITOR)
AUTOPROTHROMBIN IIA;
HYDROLASE, SERINE
PROTEINASE), PLASMA
CALCIUM BINDING, 2
GLYCOPROTEIN, COMPLEX
(BLOOD


ANTI-COAGULANT ANTI- ij 


3 cm w e 

i ill 
lilii 

05032 


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN
SUGAR BINDING PROTEIN


Compound 




RAB3A; CHAIN: A; 


RNA POLYMERASE 
PRIMARY SIGMA 
FACTOR; CHAIN: NULL; 


ACTIVATED PROTEIN C; 
CHAIN: C, L; D-PHE-PRO- 
MAI; CHAIN: P; 


HIRUSTASIN; CHAIN: 




AGGLUTININ ISOLECTIN 
VI/AGGLUTININ 
ISOLECTIN V; CHAIN: A; 

AGGLUTININ ISOLECTIN 


SEQ FOLD 
score 




oo 

ON 

On 
»— • 


79.28 


50.24 






PMF 
score 










-0.17 


58 g 
d © 


> «\ 










0.03 


0 9 


Psi 
Blast 




00 
NO 
i 

« 


5.2e-05 


oo 
O 
Z 

NO 
CM* 




On O 
O «— « 
6 0 

NO On 
CM ro 


s < 

%< 




00 
OS 
»— 1 


CM 

9 


CO 

co 

CM 


s 

.— » 


On en 


START 
AA 






ON 
i— « 

f— 1 


00 
CM 


in 


«n 

l—l — H 


CHAIN 
ID 




< 




•J 




< < 


■a 


< 


2 

■n 


1sig


1aut


i 

l-H 


1eis


SEQ ID 

NO: 




cf 






- * 1 


— « 
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PDB annotation 


UDA; LECTIN, HEVEIN 
DOMAIN, UDA,


SUPERANTIGEN 


SUGAR BINDING PROTEIN 
UDA; LECTIN, HEVEIN 
DOMAIN, UDA, 

SUPERANTIGEN


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA, 
SUPERANTIGEN, 
SACCHARIDE BINDING 


SUGAR BINDING PROTEIN 
UDA; LECTIN, HEVEIN 
DOMAIN, UDA, 
SUPERANTIGEN, 
SACCHARIDE BINDING 


SUGAR BINDING PROTEIN 
UDA; LECTIN, HEVEIN 
DOMAIN, UDA, 
SUPERANTIGEN, 
SACCHARIDE BINDING


GLYCOPROTEIN 
GLYCOPROTEIN




COMPLEX (BLOOD
COAGULATION/INHIBITOR)
CHRISTMAS FACTOR;
COMPLEX, INHIBITOR,
HEMOPHILIA/EGF, BLOOD
COAGULATION, 2 PLASMA,
SERINE PROTEASE,
CALCIUM-BINDING,
HYDROLASE, 3
GLYCOPROTEIN


Compound 


VI/AGGLUTININ 
ISOLECTIN V; CHAIN: A;


> 


AGGLUTININ ISOLECTIN 
VI/AGGLUTININ 
ISOLECTIN V; CHAIN: A; 


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


LAMININ; CHAIN: NULL; 


AGGREGATION 
INHIBITOR, GP 
ANTAGONIST KISTRIN 
(NMR, 8 STRUCTURES) 
1KST 3


FACTOR IXA; CHAIN: C,
L; D-PHE-PRO-ARG;
CHAIN: I; 


SEQ FOLD 
score 












70.40 




59.55 


PMF 
score 




0.41 


0.62 


0.11 


0.03 




0.00 




Verify
score 




-0.10 


-0.12 


-0.71 


8 
9 




-0.45 




Psi 
Blast 




oo 
O 

6 

cn 


o 

6 
en 


■—I 
ON 


o 

£ 

cn 

»— 4 


CN 

% 

m 
vb 


s 

i 

6 
cn 


3.9e-16 






Os 
vo 


3 




ON 

vo 


cn 

•N 




oo 


START 
AA 




© 

ON 


cn 

i-H 




oo 


m 

VO 


On 




CHAIN 
ID 




< 




< 


< 












1eis


1 


1 

»— < 


1 


1klo


1kst


1pfx


SEQ ID 
NO: 






5 


▼—» 




5 


5 


5 
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PDB annotation 






GLYCOPROTEIN
MEMBRANE COFACTOR
PROTEIN (MCP); VIRUS
RECEPTOR, COMPLEMENT
COFACTOR, SHORT
CONSENSUS REPEAT, 2 SCR,
MEASLES VIRUS,
GLYCOPROTEIN


GLYCOPROTEIN
MEMBRANE COFACTOR
PROTEIN (MCP); VIRUS
RECEPTOR, COMPLEMENT
COFACTOR, SHORT
CONSENSUS REPEAT, 2 SCR,
MEASLES VIRUS,
GLYCOPROTEIN


HYDROLASE/HYDROLASE 
INHIBITOR PROTEIN- 
PEPTIDE COMPLEX


COMPLEMENT INHIBITOR
VCP, SP35; COMPLEMENT,
NMR, MODULES, PROTEIN
STRUCTURE, VACCINIA
VIRUS


l & w"^ 

iC|B B 

U > Z co > 


Compound 


MEMBRANE PROTEIN 
VITELLINE MEMBRANE 
OUTER LAYER PROTEIN I 
1VMO 3


\ 


CD46; CHAIN: A, B, C, D, E, 
F; 


CD46; CHAIN: A, B, C, D, E, 
F; 


DES-GLA FACTOR VIIA
(HEAVY CHAIN); CHAIN: 
H, I; DES-GLA FACTOR 
VIIA (LIGHT CHAIN); 
CHAIN: L, M; (DPN)-PHE- 
ARG; CHAIN: C,D; 
PEPTIDE E-76; CHAIN: X, 
Y: 


COMPLEMENT CONTROL 


< 

i. 


COMPLEMENT CONTROL 
PROTEIN; CHAIN: A; 


SEQ FOLD 
score 
















PMF 
score 


o\ 
9 




-0.08 


0.94 

i 


0.04 


0.27 


-0.02 


Verify 
score 


0.04 




0.31 


-0.09 


0.14 


0.40 


0.29 


Psi
Blast 


oo 

3 

u-i 




\o 
ci 


ro 


CM 

■ 

I 

u 
in 


o\ 
ci 


<s 
*- < 

6 

NO 

ci 


H < 


no 

OS 




1597 


1647 


S 


1523 


1596 


START 
AA 


oo 
cs 




1464 


1525 


NO 

cn 
oo 


1395 


1467 


CHAIN 
ID 


< 




< 


<: 




< 


< 




1vmo




1ckl


1ckl


1dva


"8 


oo 

•a 


SEQ ID 
NO: 






5 


3 




5 
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PDB annotation 


ADHESION PROTEIN, 
TRANSMEMBRANE, 2 
GLYCOPROTEIN 




GLYCOPROTEIN 
GLYCOPROTEIN 


SERINE PROTEASE FVIIA; 
FVIIA; BLOOD 
COAGULATION, SERINE 
PROTEASE 


METAL BINDING PROTEIN 
BETA SANDWICH, 
CALCIUM-BINDING 
PROTEIN, METAL BINDING 
2 PROTEIN 


MEMBRANE ADHESION
SHORT CONSENSUS
REPEAT, SUSHI,
COMPLEMENT CONTROL
PROTEIN, 2 N-
GLYCOSYLATION, MULTI-
DOMAIN, MEMBRANE
ADHESION


MEMBRANE ADHESION
SHORT CONSENSUS
REPEAT, SUSHI,
COMPLEMENT CONTROL
PROTEIN, 2 N-
GLYCOSYLATION, MULTI-
DOMAIN, MEMBRANE
ADHESION


Compound 




GLYCOPROTEIN FACTOR 
H, 15TH CMODULE PAIR 
(NMR, MINIMIZED


AVERAGED IHFIA 1 
STRUCTURE) 1HFI4 IHFIA 
5 


LAMININ; CHAIN: NULL; 


COAGULATION FACTOR
VIIA (LIGHT CHAIN);
CHAIN: L; COAGULATION
FACTOR VIIA (HEAVY
CHAIN); CHAIN: H;
TRIPEPTIDYL INHIBITOR;
CHAIN: C;


LAMININ ALPHA2 CHAIN; 
CHAIN: A,B,C, D; 


HUMAN BETA2- 
GLYCOPROTEIN I; CHAIN:
A; 


HUMAN BETA2- 
GLYCOPROTEIN I; CHAIN:
A; 


SEQFOLD 
score 












113.83 




PMF 
score 




0.05 




0.03 


cs 

rH 

d 
i 




0.22 


Verify 
score 




0.04 


0.00 

• 


0.14 


-0.00




0.16 


Psi 
Blast 




VO 
CS 


o 


CO 


cs 

i—i » 

* 

QO 


Ok 

rH 

cn 


cs 

rH 






1596 


8 


s 


5 


1780 

i 


1709 


START 
AA 




1522 


«r> 
m 
m 


rH 
00 


S3 

CS 


1456 

i 
1 


1461 


CHAIN 
ID 










< 


< 


< • 


g s 


i 


2 


a 1 


a 

cr 
rH 


§ 

cr 


1qub


1qub


SEQID 
NO: 




$ : 


3" 


i 


3 
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CO CO 
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go 




to m 

8S-B 










o o , 

# 1 

Ph S P4 I 



O co O < 

gsgs 

6 n < 

liii 

922S 
££££ 



1 



a 2 
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5 



5 



r 

On 
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ON 



5 
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PDB annotation 


SEVEN-STRANDED
INCOMPLETE
ANTIPARALLEL UP-AND-
DOWN BETA 2 BARREL,
ACTIN-BINDING PROTEIN,
POLY-L-PROLINE BINDING
3 PROTEIN, PIP2 BINDING
PROTEIN


PROTEIN BINDING
ACETYLATION, ACTIN-
BINDING PROTEIN,
MULTIGENE FAMILY


1 

C 

e 


ACTIN-BINDING PROTEIN 




ACTIN-BINDING PROTEIN 
ACTIN-BINDING PROTEIN, 
PROFILIN, CYTOSKELETON


ACTIN-BINDING PROTEIN 
ACTIN-BINDING PROTEIN, 
PROFILIN, CYTOSKELETON 


BE 

O BO 
P« co A* 

0 p O 

1 I 

ill 
ill 




TARGETING PROTEIN PIC1,
GMP1, UBL1, SENTRIN;
SUMO-1, POST-
TRANSLATIONAL PROTEIN
MODIFICATIONS
UBIQUITIN-LIKE PROTEINS
TARGETING PROTEIN


015 

111 
|ii 

r n1 m . 

OH w 

85 3 a 
Sees * 

CO P s 


DE NOVO PROTEIN
PROTEIN DESIGN,


Compound 




PROFILIN; CHAIN: NULL; 


< 

i 


* 


ACTIN BINDING PROTEIN 
PROFILIN 1PNE 3


< 

i 


PROFILIN; CHAIN: A, B; 


PROFILIN I; CHAIN: NULL; 




SUMO-1; CHAIN: NULL; 


UBIQUITIN-LIKE PROTEIN
7, RUB1; CHAIN: A;


1D8 UBIQUITIN; CHAIN: A;


SEQFOLD 
score 












70.17 












PMF 
score 




0.94 


0.99 


0.98 


1.00 




1.00 




0.00 


1.00 


1.00 


Verify 
score 




0.15 


0.66 


0.41 


0.54 




0.72 




0.28 


1.02 


0.86 


Psi 
Blast 




<s 

1 

«~» 


le-45 


o 
o 

CN 


00 
CO 
1 

tt> 
Tf 

co 


oo 
CO 

co 


6 

rn 




vn 
O 

6 
oo 

VO 




i— i 

CO 

6 












m 


On 
<N 


«— i 




8 


s 


s 


START 
AA 




VO 


NO 




2 


»n 


< 




ON 


co 


f-l 
CO 


CHAIN 
ID 






< 




< 


< 








< 


< 


fe 




E 




1pne



1ypr


3nul 




l-i 


1bt0




SEQID 
NO: 




5 


3 




s 


3 


3 








5 
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COMPLEX (NUCLEOCAPSID 
PROTEIN/DNA) (12-53)NCP7; 
COMPLEX (NUCLEOCAPSID 
PROTEIN/DNA), NUCLEIC 
ACID, 2 RETROVIRUS,
VIRUS MORPHOGENESIS,

7TMP PTMHTO 


ANTIFREEZE PROTEIN CSPB 
BETA BARREL, 


> 

> 
5 

| 




- . PCI 

1 


III 

ggjE 

iiljj 


NUCLEOCAPSID PROTEIN
NUCLEOCAPSID PROTEIN,
HIV-2, RNA RECOGNITION,
ZINC FINGER


III? 


HIV-1 NUCLEOCAPSID
PROTEIN (MN ISOLATE) 
(NMR, 20 STRUCTURES) 
1AAF3 


DNA (ACGCC); CHAIN: D; 
NUCLEOCAPSID PROTEIN 
7; CHAIN: A; 




COLD-SHOCK PROTEIN; 
CHAIN: A, B; 


TRANSCRIPTION 
REGULATION MAJOR 
COLD SHOCK PROTEIN 
(CSPB) 1CSP3 


TRANSCRIPTION 
REGULATION MAJOR 
COLD SHOCK PROTEIN 
(CSPB) 1CSP3


TRANSCRIPTION
REGULATION MAJOR
COLD SHOCK PROTEIN 7.4
(CSPA (CS 7.4)) OF 1MJC 3
(ESCHERICHIA COLI)
1MJC 4


NUCLEOCAPSID PROTEIN; 


CHAIN: NULL; 


i 

1 




HLA-A 0201; CHAIN: A; 
BETA-2 MICROGLOBULIN; 
CHAIN: B; TAX PEPTIDE; 
CHAIN: C;TCELL 
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PDB annotation 






HYDROLASE MACHE;
HYDROLASE, SERINE 
ESTERASE, 

ACETYLCHOLINESTERASE, 
TETRAMER, 2 HYDROLASE 
FOLD, GLYCOSYLATED 
PROTEIN


HYDROLASE PNB 
ESTERASE; ALPHA-BETA


HYDROLASE DIRECTED
EVOLUTION


HYDROLASE ALPHA BETA








Compound 


(E.C.3.1.1.3) COMPLEXED
WITH COLIPASE AND
INHIBITED 1LPB 3 BY
UNDECANE
PHOSPHONATE METHYL
ESTER (TWO
CONFORMATIONS) 1LPB 4


HYDROLASE LIPASE 
(E.C.3.1.1.3) 
(TRIACYLGLYCEROL 
LIPASE) COMPLEXED 
WITH 1LPP 3
HEXADECANESULFONAT
E 1LPP 4 1LPP 71


ACETYLCHOLINESTERAS
E; CHAIN: A, B, C, D;


PARA-NITROBENZYL
ESTERASE; CHAIN: A; 




PROLYL 

AMINOPEPTIDASE;
CHAIN: A; 


HYDROLASE(CARBOXYLI
C ESTERASE) LIPASE
(E.C.3.1.1.3)
TRIACYLGLYCEROL
HYDROLASE 1THG 3


HYDROLASE(CARBOXYLI
C ESTERASE) LIPASE
(E.C.3.1.1.3) 


SEQFOLD 


e 
S 




64.96 


62.20 


66.32 
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PDB annotation 


SIGNALLING PROTEIN 
BINDING PROTEIN, 
CYTOKINE, SIGNALLING 
PROTEIN


HORMONE RECEPTOR 
HORMONE RECEPTOR, 
INSULIN RECEPTOR 
FAMILY


GLYCOPROTEIN 
GLYCOPROTEIN 


GLYCOPROTEIN 
GLYCOPROTEIN 


MEMBRANE ADHESION 
SHORT CONSENSUS
REPEAT, SUSHI, 
COMPLEMENT CONTROL 
PROTEIN, 2 N- 
GLYCOSYLATION, MULTI- 
DOMAIN, MEMBRANE 
ADHESION 


SERINE PROTEASE 
INHIBITOR FACTOR XA 
INHIBITOR; ANTISTASIN, 
CRYSTAL STRUCTURE,
FACTOR XA INHIBITOR, 2
SERINE PROTEASE
INHIBITOR, THROMBOSIS


SERINE PROTEASE
INHIBITOR FACTOR XA
INHIBITOR; ANTISTASIN,
CRYSTAL STRUCTURE,
FACTOR XA INHIBITOR, 2
SERINE PROTEASE
INHIBITOR, THROMBOSIS


SERINE PROTEASE 
INHIBITOR FACTOR XA
INHIBITOR; ANTISTASIN,
CRYSTAL STRUCTURE,
FACTOR XA INHIBITOR, 2






















Compound 


TUMOR NECROSIS 
FACTOR RECEPTOR; 
CHAIN: A,B; 


INSULIN-LIKE GROW 
FACTOR RECEPTOR! 
CHAIN: A; 




LAMININ; CHAIN: NL 


HUMAN BETA2- 
GLYCOPROTEIN I; CI 
A; 


ANTISTASIN; CHAIN: 
NULL; 


ANTISTASIN; CHAIN 
NULL; 




ANTISTASIN; CHAIN 
NULL; 


SEQ FOLD 
score 


54.40 






in 






CM 

*0 

oo • 
m 




PMF 
score 
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Verify 
score 
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0.84 




0.54 
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Iskz 
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FDB annotation 


DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)




COMPLEX (ZINC


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F. G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 

i 
1 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


< 

i" 


SEQ FOLD 
score 
















PMF 
score 




0.36 


0.28 


0.86 


0.70 


Cv 
i— i 

9 


0.34 


Verify 
score 
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0.06 
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0.18 
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1 Imey 
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PDB annotation 


FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




INSECT IMMUNITY INSECT 
IMMUNITY, LPS-BINDING, 
HOMOPHILIC ADHESION 


INSECT IMMUNITY INSECT 
IMMUNITY, LPS-BINDING, 
HOMOPHILIC ADHESION 


T-CELL SURFACE 
GLYCOPROTEIN 
IMMUNOGLOBULIN FOLD, 


1 

M 

III 

m 

-« 5> § CO 


T-CELL SURFACE
GLYCOPROTEIN
IMMUNOGLOBULIN FOLD,
TRANSMEMBRANE,
GLYCOPROTEIN, T-CELL, 2
MHC, LIPOPROTEIN, T-CELL
SURFACE GLYCOPROTEIN


CELL ADHESION NEURAL
CELL ADHESION


CELL ADHESION NEURAL
CELL ADHESION


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,


Compound 




ZINC FINGER PROTEIN 
GLI1; CHAIN: A; DNA;
CHAIN: C, D; 


ZINC FINGER PROTEIN 
GLI1; CHAIN: A; DNA; 
CHAIN: C,D; 




HEMOLIN; CHAIN: A, B;


HEMOLIN; CHAIN: A, B;


T-CELL SURFACE 
GLYCOPROTEIN CD4; 
CHAIN: NULL; 


T-CELL SURFACE 
GLYCOPROTEIN CD4; 
CHAIN: NULL; 


< 

I 


AXONIN-1; CHAIN: A; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A. B; 


SEQ FOLD 
score 
























PMF 
score 




0.70 


0.27 




0.09 


0.16 


0.36 


0.29 


0.36 


0.29 


0.22 


Verify 
score 
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© 


-0.31 
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PDB annotation 


DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN 
(IG)LIKE DOMAINS 
BELONGING TO THE I-SET 2 
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS 
BELONGING TO THE I-SET 2 
SUBGROUP WITHIN IG-LIKE 
DOMAINS, B-TREFOIL FOLD 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS 
BELONGING TO THE I-SET 2 
SUBGROUP WITHIN IG-LIKE 
DOMAINS, B-TREFOIL FOLD 


IMMUNE SYSTEM FC-
EPSILON RI-ALPHA;
IMMUNOGLOBULIN FOLD,
GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN


9 § 

o g 

S ■ s i 

^ £ 5 *7 \ 


IMMUNE SYSTEM HIGH
AFFINITY IGE-FC
RECEPTOR, FC(EPSILON)
IGE-FC; IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN,


Compound 




FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E, F, G, H;


FIBROBLAST GROWTH 
FACTOR 1; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C.D; 


FIBROBLAST GROWTH 
FACTOR 1; CHAIN: A,B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; IG EPSILON
CHAIN C REGION; CHAIN: 


SEQFOLD 
score 
















PMF 
score 




0.24 


0.05 


0.00 


0.06 


0.31 


0.25 


Verify 
score 




-0.10 


-0.54 
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00 
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PDB annotation 


IMMUNOGLOBULIN 
SUPERFAMILY, 
CARBOHYDRATE BINDING 


GLYCOPROTEIN CD4; 
IMMUNOGLOBULIN FOLD, 
TRANSMEMBRANE, 


GLYCOPROTEIN, T-CELL, 2
MHC LIPOPROTEIN,
POLYMORPHISM 


GLYCOPROTEIN CD4; 
IMMUNOGLOBULIN FOLD,
TRANSMEMBRANE, 
GLYCOPROTEIN, T-CELL, 2 
MHC LIPOPROTEIN, 
POLYMORPHISM 


IMMUNE SYSTEM CD32; 
RECEPTOR, FC, CD32,
IMMUNE SYSTEM 


IMMUNE SYSTEM CD32; 
RECEPTOR, FC, CD32, 
IMMUNE SYSTEM 


CELL ADHESION NCAM 
DOMAIN 1; CELL 
ADHESION,
GLYCOPROTEIN, HEPARIN-
BINDING, GPI-ANCHOR, 2
NEURAL ADHESION
MOLECULE,
IMMUNOGLOBULIN FOLD,
SIGNAL


CELL ADHESION NCAM
DOMAIN 1; CELL
ADHESION,
GLYCOPROTEIN, HEPARIN-
BINDING, GPI-ANCHOR, 2
NEURAL ADHESION
MOLECULE,
IMMUNOGLOBULIN FOLD,
SIGNAL


Compound 


U 
m 


T-CELL SURFACE
GLYCOPROTEIN CD4; 
CHAIN: A, B; 


T-CELL SURFACE 
GLYCOPROTEIN CD4; 
CHAIN: A, B; 


FC GAMMA RIIB; CHAIN:
A; 


FC GAMMA RIIB; CHAIN: 
A; 


NEURAL CELL ADHESION
MOLECULE; CHAIN: 
NULL; 


NEURAL CELL ADHESION 
MOLECULE; CHAIN: 
NULL; 


SEQFOLD 
score 
















PMF 
score 




0.52 


0.10 


0.11 


0.04 


0.28 


0.22 


Verify 
score 




8 

© 


-0.13 


-0.11 


-0.24 


-0.06 


0.28


Psi 
Blast 




o 
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CM j 
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PDB annotation 


LIKE, SIGNAL
TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 
RECEPTOR 


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR

RECEPTOR 


COMPLEX CD16; IGG1-FC
COMPLEX, FC FRAGMENT, 
IGG, FC RECEPTOR, CD16, 
GAMMA 


COMPLEX CD16; IGG1-FC
COMPLEX, FC FRAGMENT,
IGG, FC RECEPTOR, CD16,
GAMMA 


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;


Compound 


FACTOR RECEPTOR 1; 
CHAIN: C, D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C, D;


LOW AFFINITY 
IMMUNOGLOBULIN 
GAMMA FC RECEPTOR 
CHAIN: C; FC FRAGMENT 
OF HUMAN IGG1; CHAIN:
A,B; 


LOW AFFINITY 
IMMUNOGLOBULIN 
GAMMA FC RECEPTOR 
CHAIN: C; FC FRAGMENT 
OF HUMAN IGG1; CHAIN:
A,B; 


NEURAL CELL ADHESION 
MOLECULE; CHAIN: A, B, 
C, D;


NEURAL CELL ADHESION 
MOLECULE; CHAIN: A, B, 
C,D; 


NEURAL CELL ADHESION 
MOLECULE; CHAIN: A, B, 
C,D; 


NEURAL CELL ADHESION 
MOLECULE; CHAIN: A, B,
C, D;


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C 


SEQFOLD 
score 




i 
















PMF 
score 




© 


0.12 


0.29 


0.09 
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— cy-g-y 




COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,




PDB annotation 


FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA)




Compound 


OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 

C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A,B,D,E; 
CONSENSUS ZINC FINGER 




SEQ FOLD 
score 
















Table 5 


PMF 
score 




0.47 


1.00 


0.98 


0.89 


1.00 


1.00 




Verify 
score 




0.03 


0.13 


0.16 


-0.46 


0.40 


0.31 
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PDB annotation 


PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 

STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX


Compound 


PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQ FOLD 
score 




99.05 










PMF 
score 
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Verify 
score 
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PDB annotation 


TROPOMYOSIN COILED-
COIL ALPHA-HELICAL,
CONTRACTILE PROTEIN 


CONTRACTILE PROTEIN 
TROPOMYOSIN COILED-
COIL ALPHA-HELICAL,
CONTRACTILE PROTEIN 




CONTRACTILE PROTEIN 
TROPOMYOSIN COILED- 
COIL ALPHA-HELICAL, 
CONTRACTILE PROTEIN




COMPLEX (NUCLEOCAPSID 
PROTEIN/RNA) 
NUCLEOCAPSID PROTEIN, 
COMPLEX (NUCLEOCAPSID 
PROTEIN/RNA), 2 STEM- 
LOOP RNA~ ... 




TRANSFERASE MRNA
PROCESSING,
TRANSFERASE,
TRANSCRIPTION, RNA-
BINDING, 2
PHOSPHORYLATION,
NUCLEAR PROTEIN,
ALTERNATIVE SPLICING 3
HELICAL TURN MOTIF,
NUCLEOTIDYL
TRANSFERASE CATALYTIC
DOMAIN


TRANSFERASE MRNA
PROCESSING,
TRANSFERASE,






< 




<r 






O ^ « 


iaff 

2 CO < 












W 


Compound 


B.C.D 


TROPOMYOSIN; CI 
B, C,D 




TROPOMYOSIN; a 
B, C, D 






NUCLEOCAPSID PI 
CHAIN: A; SL3 STE] 
LOOP RNA; CHAIN: 


NUCLEOCAPSID PI 
HIV-1 NUCLEOCAF 
PROTEIN (MNISOI 
(NMR, 20 STRUCT! 


1AAF3 


ii 








POLY(A) POLYMH 
CHAIN: A; 


R 






























score 
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score 
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0.17 
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Verify 
score 
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PDB annotation 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


JLBSS 


Compound 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C,F,G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


TRANSCRIPTION 
REGULATION YEAST 
TRANSCRIPTION FACTOR 
ADR1 (RESIDUES 130 - 159)


SEQFOLD 
score 




116.59 










PMF 
score 


1.00 




1.00 


0.89 


0.16 


0.55 


Verify 
score 
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PDB annotation 




ZINC FINGER 


TRANSCRIPTION FACTOR 
SP1; ZINC FINGER,
TRANSCRIPTION
ACTIVATION, SP1


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA, 
PROTEIN, DNA, 
TRANSCRIPTION FACTOR, 
5S RNA 2 GENE, DNA 
BINDING PROTEIN, ZINC 
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


8 8 

ill is 

§ O ^ O h4 

8S8S2 


Compound 


1PAA 3 (PAPA - CARBOXY
TERMINAL ZINC FINGER
DOMAIN) MUTANT WITH
1PAA 4 PRO 131
REPLACED BY ALA, PRO
133 REPLACED BY ALA,
CYS 140 1PAA 5
REPLACED BY ALA
(P131A,P133A,C140A)
(NMR, 10 STRUCTURES)
1PAA 6


SP1F2; CHAIN: NULL; 




TRANSCRIPTION FACTOR 
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B,C,E,F; 


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE; 


if 
J 
0 


SEQ FOLD 
score 








94.31 






score 




0.46 


0.36 




0.99 


1 


score 1 
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0.32 
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PDB annotation 




MUSCLE PROTEIN MDB; 
MUSCLE PROTEIN 


MUSCLE PROTEIN MDE; 
MUSCLE PROTEIN 


MUSCLE PROTEIN MUSCLE
PROTEIN


MUSCLE PROTEIN MUSCLE
PROTEIN 


CONTRACTILE PROTEIN 
MYOSIN MOTOR, 
CONFORMATIONAL 
CHANGES 


CONTRACTILE PROTEIN 
MYOSIN, DICTYOSTELIUM, 
MOTOR, MANT, ATPASE, 
ACTIN-BINDING, 2 COILED 
COIL 


CONTRACTILE PROTEIN 
MYOSIN, DICTYOSTELIUM, 
MOTOR, MANT, ATPASE, 
ACTIN-BINDING, 2 COILED 
COIL


CONTRACTILE PROTEIN
ATPASE, MYOSIN, COILED
COIL, ACTIN-BINDING, ATP-
BINDING, 2 HEPTAD
REPEAT PATTERN,
METHYLATION,
ALKYLATION, 3
PHOSPHORYLATION,
CONTRACTILE PROTEIN 


o °. § 3 *? 

sMfe|8 

IS -is! 

u < u m 3 2 


Compound 


MYOSIN ESSENTIAL 
LIGHT CHAIN; CHAIN: Z 


MYOSIN; CHAIN: A,B,C, 
D, E, F, G, H; 


MYOSIN; CHAIN: A, B, C, 
D, E, F, G, H; 


MYOSIN; CHAIN: A, B, C, 
D, E, F; 


MYOSIN; CHAIN: A, B, C, 
D, E, F;


MYOSIN HEAD; CHAIN: A; 
MYOSIN HEAD; CHAIN: Y; 
MYOSIN HEAD; CHAIN: Z; 


MYOSIN; CHAIN: NULL; 


MYOSIN; CHAIN: NULL; 


MYOSIN; CHAIN: NULL; 


MYOSIN; CHAIN: NULL; 


SEQFOLD 
score 






526.30




489.18 




496.50 




425.46 




PMF 
score 




1.00 




1.00 




1.00 




1.00 




1.00 


Verify 
score 




0.61 




0.65 




0.33 




0.48 




0.40 

i 

! 


Psi 
Blast 




o 


© 


o 


© 


© 


o 


o 


© 


© 






o 

? 


© 


OIL 


OIL 


s 


OIL 


0IZ. 


ON 

en 

NO 


Ox 

<o 


START 
AA 




1—1 




*— « 








VO 




CO 


CHAIN 
ID 




< 


< 


< 


< 


< 














1br1


1br1


1 




1dfk


1lvk

i 


1lvk


1mnd


T3 

1 


SEQ ID
NO: 




2 


2 


8 


2 


o 


s 

i— < 


S 


On 
O 
»— • 


s 
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PDB annotation 


ALKYLATION, 3 
PHOSPHORYLATION, 
CONTRACTILE PROTEIN


MUSCLE PROTEIN MUSCLE
PROTEIN, MYOSIN
SUBFRAGMENT-1, MYOSIN
HEAD, 2 MOTOR PROTEIN


MUSCLE PROTEIN MUSCLE
PROTEIN, MYOSIN
SUBFRAGMENT-1, MYOSIN
HEAD, 2 MOTOR PROTEIN 




KINASE KINASE, SIGNAL 

TRANSDUCTION, 

CALCIUM/CALMODULIN 


KINASE KINASE, SIGNAL 

TRANSDUCTION, 

CALCIUM/CALMODULIN 


TRANSFERASE 
TRANSFERASE, 
SERINE/THREONINE- 
PROTEIN KINASE, CASEIN 
KINASE, 2 SER/THR KINASE 






Compound 




MYOSIN; CHAIN: A, B, C; 


MYOSIN; CHAIN: A, B, C; 

• 




• 

1 

! 


DEPENDENT PROTEIN 
KINASE; CHAIN: NULL; 


CALCIUM/CALMODULIN-
DEPENDENT PROTEIN
KINASE; CHAIN: NULL;


PROTEIN KINASE 
CK2/ALPHA-SUBUNIT; 
CHAIN: NULL; 


TRANSFERASE(PHOSPHO
TRANSFERASE) $C-/AMP$-
DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA (/S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


TRANSFERASE(PHOSPHO
TRANSFERASE) $C-/AMP$- 


SEQFOLD 
score 




392.48 






142.86 




110.78 




168.62 


PMF 
score 
















8 




Verify 
score 






oo 

2 






cn 
© 




d 








o 


O 




00 
1— 1 


L7e>89 


VO 

i 

cn 


o 


o 








CO ' 

r» 




00 

»— « 

CO 


© 
o 
m 


cn 


»— • 
cn 


VO 
CM 

cn 


START 
AA 






NO 




o 

f-* 


f-H 




r- 


cs 


CHAIN 
ID 




< 


< 










W 


W 


£Q 




! 


1 

<N 




VO 

o 

— < 


VO 


o 


i 


i 


SEQ ID
NO: 




2 


S 




© 


o 


o 

»— 1 
<— t 


o 

»— t 
•—1 


o 
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PDB annotation 


CELL DIVISION, MITOSIS, 
PHOSPHORYLATION 


TRANSFERASE JNK3; 
TRANSFERASE, JNK3 MAP 
KINASE, 

SERINE/THREONINE 
PROTEIN 2 KINASE 


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


TRANSFERASE MITOGEN 
ACTIVATED PROTEIN 
KINASE; TRANSFERASE, 
MAP KINASE, 


PROTEIN KINASE, 2 P38 


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE; 
GLYCOGEN METABOLISM,
TRANSFERASE,
SERINE/THREONINE-
PROTEIN, 2 KINASE, ATP-
BINDING, CALMODULIN-
BINDING


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE;
GLYCOGEN METABOLISM,
TRANSFERASE,
SERINE/THREONINE-
PROTEIN, 2 KINASE, ATP-
BINDING, CALMODULIN-
BINDING


SERINE KINASE SERINE
KINASE, TITIN, MUSCLE,


Compound 




C-JUN N-TERMINAL
KINASE; CHAIN: NULL; 


TWITCHIN; CHAIN: NULL;




TWITCHIN; CHAIN: A, B; 


« 

u 
u 


TWITCHIN; CHAIN: A, B; 


MAP KINASE P38; CHAIN:
NULL; 




PHOSPHORYLASE 
KINASE; CHAIN: NULL; 


PHOSPHORYLASE 
KINASE; CHAIN: NULL;




< 

i 


SEQ FOLD 
score 




127.21 




139.53 






119.80 




170.32 


124.66 


PMF 
score 






1.00 




1.00 


1.00 




1.00 






Verify 
score 






0.48 




0.39 


0.36 




0.77 






Psi 
Blast 




oo 

\ 

cn 


cn 
A 
•—I 


CN 

% 

CN 
«— » 


1.2e-73 


1.2e-92 


cn 

NO 

h 

CN 


oo 

00 

6 

<n 


5.1e-88 


o 
so 

6 

CN 

*-4 






cn 


S 


cn 


5 


SO 

cn 
cn 


cn 
cn 


CN 
R 


S 

CN 


cn 
cn 


START 
AA 






so 


cn 




r- 




oo 


OS 


so 


CHAIN 
ID 








< 


< 


< 








< 






M 
C 


1koa


1kob


1kob


1kob


oo 
cn 
Cu 


1phk


1phk




SEQ ID 

NO: 




2 


o 


o 


o 


o 


o 


i 

Oil 


110 


on 
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O 5 



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n 
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X O . 
pqUU 
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ST 
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£ 2 
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3 



1= 



CO 



oo 
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PDB annotation 


EUKARYOTIC INITIATION 
FACTOR 4A; IF4A, 
HELICASE, DEAD-BOX 
PROTEIN 


CHAPERONE/STRUCTURAL 
PROTEIN CHAPERONE 
ADHESIN DONOR STRAND 
COMPLEMENTATION, 2 
CHAPERONE/STRUCTURAL 
PROTEIN 


CHAPERONE/STRUCTURAL 
PROTEIN CHAPERONE 
ADHESIN DONOR STRAND 
COMPLEMENTATION, 2 
CHAPERONE/STRUCTURAL 
PROTEIN 




HYDROLASE HYDROLASE, 
DEPHOSPHORYLATION 


HYDROLASE PTP1B;
HYDROLASE, 
PHOSPHORYLATION, 
LIGAND, INHIBITOR 


HYDROLASE C2 DOMAIN,
PHOSPHOTIDYLINOSITOL,
PHOSPHOTASE,
HYDROLASE


HYDROLASE PROTEIN-
TYROSINE PHOSPHATASE;
HYDROLASE, PROTEIN
TYROSINE PHOSPHATASE,
CATALYTIC DOMAIN, 2
WPD LOOP, SH2 DOMAIN


HYDROLASE DUAL
SPECIFICITY
PHOSPHATASE, MAP
KINASE HYDROLASE


HYDROLASE DUAL


Compound 


FACTOR 4A; CHAIN: A, B; 


9 w 1 

ii 1 
iioi 


12 

|| 


PAPD-LIKE CHAPERONE
FIMC; CHAIN: A, C, E, G, I,
K, M, O; MANNOSE-
SPECIFIC ADHESIN FIMH;
CHAIN: B, D, F, H, J, L, N, P;




PROTEIN TYROSINE 
PHOSPHATASE 1B;
CHAIN: NULL; 


PROTEIN-TYROSINE 
PHOSPHATASE 1B;
CHAIN: A; 




PHOSPHOINOSITIDE 
PHOSPHOTASE PTEN; 
CHAIN: A; 


• 




PYST1; CHAIN: NULL;




PYST1; CHAIN: NULL;


S 


score 1 
























score 1 




-0.20 


9 




0.03 


-0.09 


0.95 


0.01 


0.71 


I 0.48 


1 


score 




0.14 


0.29 




0.25 


© 


0.38 


0.18 


0.14 


3 
d 


I 


Blast 




oo . 

o 

1 

o 

CS 


o 

5 

Ov 




in 

CO 

t 


oo 

<? 


3.4e-20 


© 
oo 


o 
cs 
6 
*<* 
cs 


cs 
i— < 


END 1 






1347 

i 


1307 




CO 




S 

CO 


o 

CO 


JP 

0\ 

cs 


VO 

ON 
CS 


START 
AA 




1227 


1241 




00 






On 
oo 


Os 

vn 
«— i 


«0 
OO 
»— » 


CHAIN 1 


a 




PQ 


pa 






< 


< 








pa ^ 

Q » 
ft* 




1qun


1qun




>* 

1 


CO 
00 

o 




i 


1mkp


1mkp


SEQ ID 
NO: 




cs 

^■4 


3U 






*■* 




«— 1 




2 
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PDB annotation 


SPECIFICITY 
PHOSPHATASE, MAP
KINASE HYDROLASE 


RECEPTOR D1; RECEPTOR,
PHOSPHATASE, SIGNAL 
TRANSDUCTION, 
ADHESION, 2 HYDROLASE 


HYDROLASE VHR; 
HYDROLASE, PROTEIN 
DUAL-SPECIFICITY
PHOSPHATASE 


HYDROLASE VHR; 
HYDROLASE, PROTEIN
DUAL-SPECIFICITY 
PHOSPHATASE


HYDROLASE D1;

HYDROLASE, SIGNAL 

TRANSDUCTION, 

RECEPTOR, 

GLYCOPROTEIN, 2 

PHOSPHORYLATION, 

SIGNAL 


HYDROLASE YOP51, YOP2B, 
PASTEURELLA X, PTP-ASE,
PROTEIN TYROSINE
PHOSPHATASE,
HYDROLASE


HYDROLASE YOP51, YOP2B, 
PASTEURELLA X, PTP-ASE,
PROTEIN TYROSINE
PHOSPHATASE,
HYDROLASE


TYROSINE PHOSPHATASE
SYP, SHPTP-2; TYROSINE
PHOSPHATASE, INSULIN
SIGNALING, SH2 PROTEIN




HYDROLASE SUMO
HYDROLASE, UBIQUITIN-


Compound 




RECEPTOR PROTEIN 
TYROSINE PHOSPHATASE 
MU; CHAIN: A, B; 


HUMAN VH1-RELATED
DUAL-SPECIFICITY
PHOSPHATASE CHAIN: A,
B;


HUMAN VH1-RELATED
DUAL-SPECIFICITY
PHOSPHATASE CHAIN: A,
B;


RECEPTOR PROTEIN 
TYROSINE PHOSPHATASE 
ALPHA; CHAIN: A,B; 


YERSINIA PROTEIN
TYROSINE
PHOSPHATASE; CHAIN:
NULL;


YERSINIA PROTEIN 
TYROSINE 

PHOSPHATASE; CHAIN: 
NULL; 


< 




1 
s < 


SEQ FOLD 
score 






















PMF 
score 




0.09 


0.99 


0.40 

i 
i 


0.09 


0.00 


0.39 


CN 
f— i 

9 




1.00 


Verify 
score 




-0.10 


0.49 


0.38 


0.31 


-0.05 


-0.07 


0.17 




0.51 


Psi 
Blast 




1 

»— * 


g 

o 

oi 


oo 

NO 


VO 

1 

oo 
vd 


CN 

o 

00 

vd 


3 

<D 
VO 
CO 


oo 

I 

1-H 




vS 






00 

o\ 

CN 


vo 


O 

cn 


oo 


o\ 

00 
CN 


s 


oo 
0\ 
CN 




oo 
oo 
»o 


START 
AA 




OO 




s 

1—1 


a 


\n 
i— « 


1 


Os 




3 


CHAIN 
ID 




< 


< 


< 


< 






< 




< 


A* 




1rpm


1vhr


1vhr


1yfo


1ytn


1ytn






1euv


SEQ ID
NO: 




«— • 

—4 


•<* 


»— * 


+-4 


v— t 


«— » 


«-« 

»— 1 
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8 
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a I 
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m 
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5g 



§ 
i 








3« 

58 



s § 



5 

o 
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O 

— * 



oo 
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rs 



5 
u 



u 



03 
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PDB annotation 


COMPLEX




STRUCTURAL PROTEIN 
RETINAL S-ANTIGEN, 48 KD 
PROTEIN; VISUAL 
ARRESTIN, 

DESENSITISATION OF THE
VISUAL TRANSDUCTION 2 
CASCADE, BINDING TO 
ACTICATED AND 
PHOSPHORYLATED 
RHODOPSIN 


STRUCTURAL PROTEIN 
RETINAL S-ANTIGEN, 48 KD 
PROTEIN; VISUAL 
ARRESTIN, 

DESENSITISATION OF THE
VISUAL TRANSDUCTION 2 
CASCADE, BINDING TO 
ACTICATED AND 
PHOSPHORYLATED 
RHODOPSIN 


STRUCTURAL PROTEIN 
RETINAL S-ANTIGEN, 48 KD
PROTEIN; VISUAL
ARRESTIN,

DESENSITISATION OF THE
VISUAL TRANSDUCTION 2
CASCADE, BINDING TO
ACTICATED AND
PHOSPHORYLATED
RHODOPSIN




PROTEASE PROSOME,
MULTICATALYTIC
PROTEASE, MCP,
MACROPAIN; PROTEASE,
PROTEASOME, HYDROLASE


MULTICATALYTIC


Compound 






ARRESTIN; CHAIN: A, B, C, 
D; 


U 
< 


a 


ARRESTIN; CHAIN: A, B, C, 


a 




PROTEASOME; CHAIN: A, 
B,C,D,E,F,G,H,I,J,K,L,


20S PROTEASOME;


SEQFOLD 
score 






73.18 




71.95 




71.75 


55.61


PMF
score 








0.00 










Verify 
score 








m 
m 

9 










Psi 
Blast 






i ' 


l-H 


I 




i 


co 

I 








00 

m 




cn 




8 

CM 




START 
AA 








53. 










CHAIN 
ID 






< 


< 


a 














1cf1

1 
i 


1cf1


1cf1




1 




SEQ ID
NO: 






s 


2 


i—t 
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Kt 






Ui. 




PDB annotation 


HALOPEROXIDASE 
BROMOPEROXIDASE L, 
HALOPEROXIDASE L; 
HALOPEROXIDASE, 
OXIDOREDUCTASE 


HALOPEROXIDASE 
CHLOROPEROXIDASE A1,
HALOPEROXIDASE A1;
HALOPEROXIDASE, 
OXIDOREDUCTASE 


HALOPEROXIDASE 
HALOPEROXIDASE F; 
HALOPEROXIDASE, 
OXIDOREDUCTASE, 
PROPIONATE COMPLEX 


AMINOPEPTIDASE
AMINOPEPTIDASE, 
PROLINE IMINOPEPTIDASE, 
SERINE PROTEASE, 2 
XANTHOMONAS 
CAMPESTRIS 


HYDROLASE HYDROLASE, 
HALOALKANE 
DEHALOGENASE, 
ALPHA/BETA-HYDROLASE 


HALOPEROXIDASE 
HALOPEROXIDASE A2, 
CHLOROPEROXIDASE A2; 
HALOPEROXIDASE, 


< 

P 

O p. 


2 HYDROLASE FOLD. 
MUTANT M99T 


HYDROLASE BPHD; 
HYDROLASE, PCB 
DEGRADATION


HYDROLASE A/B 
HYDROLASE FOLD, 
DEHALOGENASE I-S BOND 


Compound 


CHLOROPEROXIDASE L; 
CHAIN: A,B,C; 


BROMOPEROXIDASE A1;
CHAIN: NULL; 


CHLOROPEROXIDASE F; 
CHAIN: NULL; 


PROLINE 

IMINOPEPTIDASE; CHAIN: 
A, B; 


s 


DEHALOGENASE; CHAIN: 
NULL; 


BROMOPEROXIDASE A2; 
CHAIN: NULL; 


lis 


HALOALKANE 
DEHALOGENASE; 1- 
CHLOROHEXANE CHAIN: 


SEQ FOLD 
score 


68.67 


54.47 


50.66 


56.93 


71.01 


68.50 


63.30 


73.30 


PMF 
score 


















Verify 
score 
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O 


! 

t— i 


? 
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*n 


cn 


00 

cn 




cn 

oo 
vd 


a< 


VO 

VO 

cn 


s 

cn 


VO 
VO 

cn 


55 

m 


cn 
ir> 
cn 


»r> 

VO 

cn 


« 
cn 


oo 
vo 
cn 


START 
AA 








ON 

m 


i— « 

m 


CN 




»o 
vo 


CHAIN 
ID 


< 








< 












< 




< 


la 


1a88


cr 

00 
CO 


CO 
00 

cd 

r- 1 


i 

i— < 


oo 
VO 


1brt




1cqw


SEQ ID
NO: 
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8 
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PDB annotation 




PROTEIN KINASE CDK2; 
PROTEIN KINASE, CELL 
CYCLE, 

PHOSPHORYLATION, 
STAUROSPORINE, 2 CELL 
DIVISION, MITOSIS, 
INHIBITION


COMPLEX
(KINASE/INHIBITOR) CDK6;
P19INK4D; CYCLIN
DEPENDENT KINASE,
CYCLIN DEPENDENT
KINASE INHIBITORY 2
PROTEIN, CDK, INK4, CELL
CYCLE, COMPLEX 
(KINASE/INHIBITOR) 
HEADER HELIX 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN, 
CYCLIN-DEPENDENT 
KINASE, CELL CYCLE 2 
CONTROL, ALPHA/BETA,
COMPLEX (INHIBITOR
PROTEIN/KINASE)






Compound 


MEGA-8 1APM 6


CYCLIN-DEPENDENT 
PROTEIN KINASE 2; 
CHAIN: NULL; 


CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A, C; 
CYCLIN-DEPENDENT 
KINASE INHIBITOR; 
CHAIN: B,D; 


CYCLIN-DEPENDENT
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B; 


PHOSPHOTRANSFERASE
CAMP-DEPENDENT
PROTEIN KINASE
CATALYTIC SUBUNIT
1CMK 3 (E.C.2.7.1.37)
1CMK 4


TRANSFERASE(PHOSPHO 
TRANSFERASE) CAMP- 
DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37) 
(CAPK) 1CTP 3
(CATALYTIC SUBUNIT)
1CTP 4


SEQFOLD 
score 




160.34


148.04 


163.82 


116.16 


112.97 


PMF 
score 














Verify 
score 
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o 
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cn 
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§ 
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00 
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K 
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SEQID 
NO: 
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PDB annotation 


PROTEIN KINASE, 
TRANSFERASE 


SERINE KINASE SERINE 
KINASE, TITIN, MUSCLE,
AUTOINHIBITION


TRANSFERASE MITOGEN 
ACTIVATED PROTEIN 
KINASE, MAP 2, ERK2;
TRANSFERASE, 
SERINE/THREONINE- 


PROTEIN KINASE, MAP
KINASE, 2 ERK2




E 


PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), 
RNA,
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR 
PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), 
RNA,
SNRNP, RIBONUCLEOPROTEIN




COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 




TITIN; CHAIN: A, B;


EXTRACELLULAR
REGULATED KINASE 2; 
CHAIN: NULL; 




| 
< 


CHAIN: Q, R; U2 A';
CHAIN: A, C; U2 B";
CHAIN: B, D;


U2 RNA HAIRPIN IV; 
CHAIN: Q, R; U2 A';
CHAIN: A, C; U2 B"; 
CHAIN: B,D; 




QGSR ZINC FINGER
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


.Jo 
n|| 

p - 


SEQFOLD 
score 




127.01 


165.77 
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PDB annotation 


PROTEIN KINASE CDK2; 
PROTEIN KINASE, CELL 


CYCLE, 

PHOSPHORYLATION, 
STAUROSPORINE, 2 CELL 
DIVISION, MITOSIS, 
INHIBITION


COMPLEX
(KINASE/INHIBITOR) CDK6;
P19INK4D; CYCLIN 
DEPENDENT KINASE, 
CYCLIN DEPENDENT 


KINASE INHIBITORY 2 
PROTEIN, CDK, INK4, CELL
CYCLE, COMPLEX
(KINASE/INHIBITOR)
HEADER HELIX 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN, 
CYCLIN-DEPENDENT 
KINASE, CELL CYCLE 2 
CONTROL, ALPHA/BETA, 
COMPLEX (INHIBITOR
PROTEIN/KINASE)






O -fc. wife 
1 


PROTEIN KINASE CDK2;


Compound 


CYCLIN-DEPENDENT 
PROTEIN KINASE 2; 


CHAIN: NULL; 


CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A, C;
CYCLIN-DEPENDENT 
KINASE INHIBITOR; 
CHAIN: B,D; 




CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B; 


PHOSPHOTRANSFERASE 
CAMP-DEPENDENT 
PROTEIN KINASE 
CATALYTIC SUBUNIT
1CMK 3 (E.C.2.7.1.37)
1CMK 4


TRANSFERASE(PHOSPHO 
TRANSFERASE) CAMP- 
DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37) 
(CAPK) 1CTP 3
(CATALYTIC SUBUNIT)
1CTP 4


HUMAN CYCLIN-


SEQFOLD 
score 


110.89 


112.00


135.09 


252.68 


244.28 


123.56 


PMF 
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PDB annotation 


• 


COMPLEX (GTP-
BINDING/TRANSDUCER)
BETA1, TRANSDUCIN BETA


TRANSDUCIN GAMMA 
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION




COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
GABPALPHA; GABPBETA1;
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), DNA- 
BINDING, 2 NUCLEAR 
PROTEIN, ETS DOMAIN, 
ANKYRIN REPEATS, 
TRANSCRIPTION 3 FACTOR 


TRANSCRIPTION FACTOR 


8^ 

| 

10 P si 

cu h a 

828 


COMPLEX (TRANSCRIPTION
REG/ANK REPEAT)
COMPLEX (TRANSCRIPTION
REGULATION/ANK
REPEAT), ANKYRIN 2
REPEAT HELIX




OXIDOREDUCTASE 
TRYPANOTHIONE
REDUCTASE, FAD
DEPENDENT DISULPHIDE 2
OXIDOREDUCTASE


1 


| 


Compound 




GT-ALPHA/GI-ALPHA 
CHIMERA; CHAIN: A; GT- 
BETA; CHAIN: B; GT- 


) 




GA BINDING PROTEIN 
ALPHA; CHAIN: A; GA 
BINDING PROTEIN BETA 
I; CHAIN: B; DNA; CHAIN: 
D,E; 


«n 

I 
* 


tf . 

7 ♦~ / P-C {—4 

□ co E 7. 

|||| 


NF-KAPPA-B P65; CHAIN:
A, C; NF-KAPPA-B P50;
CHAIN: B, D; I-KAPPA-B-
ALPHA; CHAIN: E, F;




TRYPANOTHIONE 
REDUCTASE; CHAIN: A, B; 


DIHYDROLIPOAMIDE
DEHYDROGENASE; 


SEQFOLD
score 




66.17 
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83.26 


91.60 




78.99 
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PDB annotation 








KINASE KINASE, SIGNAL 
TRANSDUCTION,
CALCIUM/CALMODULIN




COMPLEX
(KINASE/INHIBITOR) CDK6;
P19INK4D; CYCLIN
DEPENDENT KINASE,
CYCLIN DEPENDENT
KINASE INHIBITORY 2
PROTEIN, CDK, INK4, CELL
CYCLE, COMPLEX
(KINASE/INHIBITOR)
HEADER HELIX




Compound 




OXIDOREDUCTASE 
DIHYDROLIPOAMIDE 


DEHYDROGENASE 
(E.C.1.8.1.4) 3LAD 3




| 

u 


DEPENDENT PROTEIN 
KINASE; CHAIN: NULL; 


TRANSFERASE(PHOSPHO 
TRANSFERASE) $C-/AMP$- 
DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA (/S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


CYCLIN-DEPENDENT
KINASE 6; CHAIN: A, C;
CYCLIN-DEPENDENT
KINASE INHIBITOR;
CHAIN: B, D;


PHOSPHOTRANSFERASE 
CAMP-DEPENDENT 


PROTEIN KINASE
CATALYTIC SUBUNIT
1CMK 3 (E.C.2.7.1.37)
1CMK 4


SEQFOLD 
score 




114.03 




106.55 


104.03 


72.81 


100.79 


PMF 
score 
















Verify 
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PDB annotation 




PROTEIN KINASE CDK2; 
TRANSFERASE,


SERINE/THREONINE 
PROTEIN KINASE, ATP- 
BINDING, 2 CELL CYCLE, 
CELL DIVISION, MITOSIS, 
PHOSPHORYLATION 


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION 


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE;
GLYCOGEN METABOLISM,
TRANSFERASE,
SERINE/THREONINE-
PROTEIN, 2 KINASE, ATP-
BINDING, CALMODULIN-
BINDING


TRANSFERASE MAP
KINASE,

SERINE/THREONINE
PROTEIN KINASE,
TRANSFERASE


1 3 

2 Co* 

w 5 i 

ill 

w W < 




4 

c/ 


PROTEIN COLICIN,
BACTERIOCIN, ION
CHANNEL FORMATION,
TRANSMEMBRANE 2
PROTEIN


Compound 


TRANSFERASE(PHOSPHO 
TRANSFERASE) CAMP- 
DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37) 
(CAPK) 1CTP 3
(CATALYTIC SUBUNIT)
1CTP 4


HUMAN CYCLIN-
DEPENDENT KINASE 2;


CHAIN: NULL; 


» 


PHOSPHORYLASE 
KINASE; CHAIN: NULL; 






THIN; CHAIN: A, B; 




< 

E 


NULL; 




SEQFOLD 
score 


100.24 


78.81 

j 


104.78 


110.63 
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PDB annotation 


SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G 
PROTEIN, HETEROTRIMER 2 
SIGNAL TRANSDUCTION 


COMPLEX (GTP-
BINDING/TRANSDUCER)
BETA1, TRANSDUCIN BETA
SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION


COMPLEX (GTP- 
BINDING/TRANSDUCER) 
BETA1, TRANSDUCIN BETA 
SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION


OXIDOREDUCTASE
ENZYME, NITRITE
REDUCTASE,
OXIDOREDUCTASE,
DENITRIFICATION, 2
ELECTRON TRANSPORT,
PERIPLASMIC


tie 


g 1 5 O co o 

8|S«8|8 
§ §S 5? 3 § S § 


Compound 


GAMMA; CHAIN: G; 


GT-ALPHA/GI-ALPHA 
CHIMERA; CHAIN: A; GT- 
BETA; CHAIN: B; GT-
GAMMA; CHAIN: G; 


GT-ALPHA/GI-ALPHA 
CHIMERA; CHAIN: A; GT- 
BETA; CHAIN: B; GT-
GAMMA; CHAIN: G; 


CYTOCHROME CD1


NITRITE REDUCTASE; 
CHAIN: A, B; 




23S RRNA; CHAIN: 0; 5S 
RRNA; CHAIN: 9; 
RIBOSOMAL PROTEIN L2;
CHAIN: A; RIBOSOMAL 
PROTEIN L3; CHAIN: B; 
RIBOSOMAL PROTEIN L4; 
CHAIN: C; RIBOSOMAL; 


q 
















SEQFOI 
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PDB annotation 


HL33; 50S RIBOSOMAL
PROTEIN L30P, HMAL30,
HL20, HL16; 50S


L34, HL30; 50S RIBOSOMAL
PROTEIN L32E, HL5; 50S
RIBOSOMAL PROTEIN L37E,
L35E; 50S RIBOSOMAL
PROTEINS L39E, HL39E,
HL46E; 50S RIBOSOMAL
PROTEIN L44E, LA, HLA; 50S
RIBOSOMAL PROTEIN L6P,
HMAL6, HL10 RIBOSOME


ASSEMBLY, RNA-RNA, 
PROTEIN-RNA, PROTEIN- 
PROTEIN 


w 


SERINE PROTEASE PCPA2;
SERINE PROTEASE,
ZYMOGEN, HYDROLASE


SERINE PROTEASE
PORCINE
PROCARBOXYPEPTIDASE,
SERINE PROTEASE


K/Q, 
/ 


t 

• 


DNA BINDING PROTEIN
CENTROMERE PROTEIN,
DNA-BINDING, HELIX-


Compound 


L23; CHAIN: P; 
RIBOSOMAL PROTEIN 


L24; CHAIN: Q; 
RIBOSOMAL PROTEIN 


L24E; CHAIN: R; 
RIBOSOMAL PROTEIN 
L29; CHAIN: S; 
RIBOSOMAL PROTEIN 
L30; CHAIN: T; 
RIBOSOMAL PROTEIN 
L31E; CHAIN: U; 
RIBOSOMAL PROTEIN 
L32E; CHAIN: V; 
RIBOSOMAL PROTEIN 
L37AE; CHAIN: W; 

L37E; CHAIN: X;
RIBOSOMAL PROTEIN
L39E; CHAIN: Y;
RIBOSOMAL PROTEIN
L44E; CHAIN: Z;

RIBOSOMAL PROTEIN L6; 

CHAIN: 1; 




PROCARBOXYPEPTIDASE 
A2; CHAIN: NULL; 


PROCARBOXYPEPTIDASE 
B; CHAIN: NULL 




HYDROLASE(C- 
TERMINAL PEPTIDASE)
PROCARBOXYPEPTIDASE 
A(E.C.3.4.12.2)1PCA3 




CENTROMERE PROTEIN 
B; CHAIN: A; 


SEQ FOLD 
score 
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114.78 






A- 


score 














CO 
00 

d 




score j 














oo 
O 

d 


Psi 
Blast 








00 

oo 

t 

« 


o 

CO 




« 


S 






00 

tn 


On 

»o 






NO 


START 
AA 






o\ 


CO 

cs 

1— « 


B 




VO 


CHAIN 1 


a 
















Is 








CO 

S 


<a 
u 
a 




1 


9 


i 






oo 
VO 
CN 


00 
NO 
CM 


00 

8 




OS 

NO 

CN 



355 



WO 02/081731 



PCT/US02/01222







PDB annotation 


KARYOPHERIN BETA, P95 
SMALL GTPASE, NUCLEAR 
TRANSPORT RECEPTOR 


NUCLEAR TRANSPORT 
PROTEIN COMPLEX HEAT 
REPEATS, NUCLEAR 
TRANSPORT PROTEIN 
COMPLEX 


TRANSPORT RECEPTOR
KARYOPHERIN BETA-1, 
NUCLEAR FACTOR P97, 
IMPORTIN IMPORTIN 
ALPHA-2 SUBUNIT,
KARYOPHERIN ALPHA-2 
TRANSPORT RECEPTOR, 
NUCLEAR IMPORT, HEAT 
MOTIF, NLS-BINDING 


TRANSPORT RECEPTOR 
KARYOPHERIN BETA-1, 
NUCLEAR FACTOR P97, 
IMPORTIN IMPORTIN 
ALPHA-2 SUBUNTT, 
KARYOPHERIN ALPHA-2 
TRANSPORT RECEPTOR, T 
NUCLEAR IMPORT, HEAT f 
MOTIF, NLS-BINDING 


STRUCTURAL PROTEIN J 
ARMADILLO REPEAT, 1 
BETA-CATENIN, *\ 
STRUCTURAL PROTEIN M 


□ 


SCAFFOLD PROTEIN H 
SCAFFOLD PROTEIN, PP2A, V 
PHOSPHORYLATION, HEAT g * 
REPEAT r! 


SCAFFOLD PROTEIN £j 
SCAFFOLD PROTEIN, PP2A, H 1 
PHOSPHORYLATION, HEAT R I 


Compound 


IMPORTIN BETA 
SUBUNIT; CHAIN: B, D


KARYOPHERIN BETA2;
CHAIN: B; RAN; CHAIN: C;


IMPORTIN BETA
SUBUNIT; CHAIN: A;
IMPORTIN ALPHA-2
SUBUNIT; CHAIN: B;


IMPORTIN BETA
SUBUNIT; CHAIN: A;
IMPORTIN ALPHA-2
SUBUNIT; CHAIN: B;


BETA-CATENIN; CHAIN:
NULL;


PROTEIN PHOSPHATASE
PP2A; CHAIN: A, B;


PROTEIN PHOSPHATASE
PP2A; CHAIN: A, B;


SEQFOLD 
score 




116.75 




127.17 






125.57 
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PDB annotation 


PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX


Compound 


PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQFOLD 
score 














PMF 
score 




0.89 


1.00 


1.00 


1.00


1.00 


Verify 
score 




0.19 


0.29 


0.30 


0.21 


0.37 
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PDB annotation 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQ FOLD 
score 
















PMF 
score 




1.00


1.00 


8 

f-H 


1.00 


1.00 


1.00 


Verify 
score 




0.48 
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PDB annotation 


INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQ FOLD 
score 




100.48










PMF 
score 






1.00 


1.00


1.00 


0.09 


Verify 
score 






0.43 
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PDB annotation 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA


Compound


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


TRANSCRIPTION FACTOR
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F;


TRANSCRIPTION FACTOR
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


SEQ FOLD 
score 












PMF 
score 
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0.00 
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Verify 
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PDB annotation 




COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 


MUTANT WITH CYS 11
1BBO 3 REPLACED BY
ABU (C11ABU) (NMR, 60
STRUCTURES) 1BBO 4


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




SEQFOLD 
score 












• 


PMF 
score 
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0.95 


1.00 
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Verify 
score 
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PDB annotation 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN


Compound 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


SEQFOLD 
score 








103.66 






PMF 
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0.89 
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1.00 




1.00 


1.00 


Verify 
score 


0.23 


0.23 


0.24 




0.10 


0.48 


Psi 
Blast 


5.1e-39 


START
AA


CHAIN
ID


1mey


1mey


SEQ ID
NO:






PDB annotation 


INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2


Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


SEQFOLD 
score 














PMF 
score 




0.96 


1.00 


0.90 


0.57
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PDB annotation 


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


IMMUNOGLOBULIN
IMMUNOGLOBULIN, KAPPA
LIGHT-CHAIN DIMER
HEADER


COMPLEX
(ANTIBODY/ANTIGEN) FAB-
12; VEGF; COMPLEX
(ANTIBODY/ANTIGEN),
ANGIOGENIC FACTOR


COMPLEX (HUMANIZED
ANTIBODY/HYDROLASE)
MURAMIDASE:
HUMANIZED ANTIBODY,
ANTIBODY COMPLEX, FV,
ANTI-LYSOZYME, 2
COMPLEX (HUMANIZED
ANTIBODY/HYDROLASE)


IMMUNE SYSTEM REIV,
STABILIZED
IMMUNOGLOBULIN
FRAGMENT, BENCE-JONES
2 PROTEIN, IMMUNE
SYSTEM


ANTIBODY THERAPEUTIC
ANTIBODY, CD52


IMMUNE SYSTEM FAB-IBP
COMPLEX CRYSTAL
STRUCTURE 2.7A
RESOLUTION BINDING 2


Compound 


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


IMMUNOGLOBULIN;
CHAIN: A, B;


FAB FRAGMENT; CHAIN:
L, H, J, K; VASCULAR


FACTOR; CHAIN: V, W;


HULYS11; CHAIN: A, B, D,
E; LYSOZYME; CHAIN: C,
F;


IG KAPPA CHAIN V-I
REGION REI; CHAIN: A, B;


CAMPATH-1H:LIGHT
CHAIN; CHAIN: L;
CAMPATH-1H:HEAVY
CHAIN; CHAIN: H;


IGM RF 2A2; CHAIN: A, C,
E; IGM RF 2A2; CHAIN: B,
D, F; IMMUNOGLOBULIN


SEQFOLD 
score 


















PMF 
score 


0.89 




CO 

© 


0.17 


0.47 


0.16 


0.29 


0.27 


Verify 
score 


0.11 




-0.33 


-0.05


-0.56 


9 


-0.00 


-0.12 


Psi 
Blast 


3.4e-32 
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i> 
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CO 
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CO 


Ti- 
ro 

6 

CO 
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•o 
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CO 
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1— t 


s 


8 
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AA 
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CO 


CO 


Tf 
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a 


CO 
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ID 
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< 
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lcel 1 
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r- 




PDB annotation 


SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


LIGASE CBL, UBCH7, ZAP-
70, E2, UBIQUITIN, E3,
PHOSPHORYLATION, 2
TYROSINE KINASE,
UBIQUITINATION, PROTEIN
DEGRADATION,


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


DNA INTEGRATION DNA
INTEGRATION, AIDS,
POLYPROTEIN,
HYDROLASE, 2
ENDONUCLEASE,
POLYNUCLEOTIDYL
TRANSFERASE, DNA
BINDING 3 (VIRAL)


DNA INTEGRATION DNA
INTEGRATION, AIDS,
POLYPROTEIN,


Compound 






VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


SIGNAL TRANSDUCTION
PROTEIN CBL; CHAIN: A;
ZAP-70 PEPTIDE; CHAIN:
B; UBIQUITIN-
CONJUGATING ENZYME
E12-18 KDA UBCH7;
CHAIN: C;


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;


RAG1; CHAIN: NULL;




HIV-1 INTEGRASE; CHAIN: 




I INTEGRASE; CHAIN: A, B, 


O 


SEQFOLD 
score 




















PMF
score






0.83 


0.52 


0.27 


0.65 




0.22 


0.06 


Verify
score






0.02 


-0.34 


0.30 


0.03 




-0.90 


-0.63 


Psi
Blast






1.4e-12 


1e-13


START
AA








1chc


1fbv


1rmd


1bhl


1bl3
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NO: 
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PDB annotation 




MUSCLE PROTEIN MDE; 
MUSCLE PROTEIN 


MUSCLE PROTEIN MDE;
MUSCLE PROTEIN


MUSCLE PROTEIN MUSCLE
PROTEIN


MUSCLE PROTEIN MUSCLE
PROTEIN


CONTRACTILE PROTEIN
MYOSIN MOTOR,
CONFORMATIONAL
CHANGES


CONTRACTILE PROTEIN
MYOSIN, DICTYOSTELIUM,
MOTOR, MANT, ATPASE,
ACTIN-BINDING, 2 COILED
COIL


CONTRACTILE PROTEIN
MYOSIN, DICTYOSTELIUM,
MOTOR, MANT, ATPASE,
ACTIN-BINDING, 2 COILED
COIL


CONTRACTILE PROTEIN
ATPASE, MYOSIN, COILED
COIL, ACTIN-BINDING, ATP-
BINDING, 2 HEPTAD
REPEAT PATTERN,
METHYLATION,
ALKYLATION, 3
PHOSPHORYLATION,
CONTRACTILE PROTEIN


CONTRACTILE PROTEIN
ATPASE, MYOSIN, COILED
COIL, ACTIN-BINDING, ATP-
BINDING, 2 HEPTAD
REPEAT PATTERN,


Compound 


CHAIN; CHAIN: Y; 
MYOSIN ESSENTIAL 
LIGHT CHAIN; CHAIN: Z; 


MYOSIN; CHAIN: A, B, C,
D, E, F, G, H;


MYOSIN; CHAIN: A, B, C,
D, E, F, G, H;


MYOSIN; CHAIN: A, B, C,
D, E, F;


MYOSIN; CHAIN: A, B, C,
D, E, F;


MYOSIN HEAD; CHAIN: A;
MYOSIN HEAD; CHAIN: Y;
MYOSIN HEAD; CHAIN: Z;


MYOSIN; CHAIN: NULL;


MYOSIN; CHAIN: NULL;


MYOSIN; CHAIN: NULL;


MYOSIN; CHAIN: NULL;


SEQFOLD 
score 






451.74 




417.00 




420.64 




370.75 




PMF 
score 




1.00 




1.00 




1.00 




1.00 




1.00 


Verify 
score 




0.56 




0.57 




0.45




0.29 




0.71 


Psi
Blast 
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o 


o 


o 


o 


o 


o 


o 
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SO 
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OO 


1 


S 
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i 
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AA 
















CO 




CO 


CHAIN 
ID 




< 


< 


< 


< 


< 










6" 




1br1


1br1


1dfk


1lvk


1lvk


1mnd


1mnd


SEQ ID 
NO: 




S 
co 


S 

CO 


s 

co 


8 

CO 


8 

CO 


8 

CO 
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CO 


8 
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PDB annotation 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


TRANSCRIPTION
REGULATION YEAST
TRANSCRIPTION FACTOR
ADR1 (RESIDUES 130 - 159)


SEQFOLD 
score 










111.82 




PMF 
score 


0.60 


0.12 


1.00 


1.00 




0.60 


Verify 
score 


0.42 


0.27 


0.43 


0.29 




-0.41 


Psi 
Blast 
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fS 
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R 
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AA 


OO 
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PDB annotation 


PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION


Compound 




TRANSCRIPTION FACTOR
IIIA; CHAIN: A; 5S RNA


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


SEQFOLD 
score 




60.98 








PMF 
score 






-0.06 


0.75 


0.12 


Verify 
score 






0.15 


0.09 


0.37 
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ID 
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PDB annotation 


ANTIBODY DESIGN, 
IMMUNOGLOBULIN 2 
STRUCTURE, ANTIGEN- 
BINDING SITE, CANONICAL 
CONFORMATION, 3 
COMPLEMENTARITY- - 
DETERMINING REGION- ill 


IMMUNOGLOBULIN INTACT 
IMMUNOGLOBULIN, V 
REGION, C REGION, HINGE 
REGION 


i M 

1 




COMPLEX . J 
(TMMUNOGLOBULIN/HYDR^ 
OLASE) N10 FAB 01 
IMMUNOGLOBULIN; INSN 1P 
STAPHYLOCOCCAL fy 
NUCLEASE \ 
RIBONUCLEATE, INSNll « 
IMMUNOGLOBULIN, W 
STAPHYLOCOCCAL 
NUCLEASE INSN 25 fU 


MONOCLONAL ANTIBODY fU 
MONOCLONAL ANTIBODY,pj 


Compound 




IGGl INTACT ANTIBODY 
MAB61.1.3; CHAIN: A, B, C, 


Q 


IMMUNOGLOBULIN 
CHA255 . 

IMMUNOGLOBULIN FAB' 
FRAGMENT (IGGl- 
LAMBDA) COMPLEX 1IND 
3 WITH 4-[N-(2- 
HYDROXYETHYL)- 


3 8 

ill 
llf 

D g »n 


IMMUNOGLOBULIN FAB
D44.1 (IGG1, KAPPA)
(BALB/C MOUSE,
MONOCLONAL
ANTIBODY) 1MLB 5


IGG FAB (IGG1, KAPPA);
1NSN 4 CHAIN: L, H; 1NSN
5 STAPHYLOCOCCAL
NUCLEASE; 1NSN 9
CHAIN: S; 1NSN 10


MONOCLONAL
ANTIBODY 3A2; CHAIN: H,

SEQFOLD 
score 


• 












PMF 
score 




0.43 


2 


0.45 


0.84 


0.54 


* s 
> ^ 




-0.29 


0.09 


0.31 


0.03 


-0.07 
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Blast 
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PDB annotation 


ENDOCYTOSIS/EXOCYTOSIS
G PROTEIN, VESICULAR
TRAFFIC, GTP
HYDROLYSIS, YPT/RAB 2
PROTEIN, ENDOCYTOSIS,
HYDROLASE


GTP-BINDING PROTEIN
GTP-BINDING PROTEIN,
SMALL G PROTEIN, RAP2,
GDP, RAS


GTP-BINDING PROTEIN
GTP-BINDING PROTEIN,
SMALL G PROTEIN, RAP2,
GDP, RAS


GTP-BINDING GTP-
BINDING, GTPASE, SMALL
G-PROTEIN, RHO FAMILY,
RAS SUPER 2 FAMILY


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,
NEUROTRANSMITTER
RELEASE, HYDROLASE


COMPLEX (MHC/VIRAL
PEPTIDE/RECEPTOR) HLA-
A2 HEAVY CHAIN; CLASS I
MHC, T-CELL RECEPTOR,


Compound 


GTP-BINDING PROTEIN 
YPT51; CHAIN: A; 


RAP2A; CHAIN: NULL; 






RAC1; CHAIN: NULL;


RAB-3A; CHAIN: A;
RABPHILIN-3A; CHAIN: B;


RAB3A; CHAIN: A;


HLA-A 0201; CHAIN: A;
BETA-2 MICROGLOBULIN;
CHAIN: B; TAX PEPTIDE;
CHAIN: C; T CELL


SEQFOLD 
score 




68.65 




! 








73.76 


PMF 
score 


0.29 




zz/o 


o 
9 


0.34 


0.36 






Verify 
score 


0.41 




0.14 


0.10 


0.05 


-0.08 
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Izbd 


3rab 
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PDB annotation 1 


IMMUNOGLOBULIN
FRAGMENT, BENCE-JONES
2 PROTEIN, IMMUNE
SYSTEM


IMMUNE SYSTEM FAB-IBP
COMPLEX CRYSTAL
STRUCTURE 2.7A
RESOLUTION BINDING 2
OUTSIDE THE ANTIGEN
COMBINING SITE
SUPERANTIGEN FAB VH3 3
SPECIFICITY


IMMUNE SYSTEM
IMMUNOGLOBULIN FOLD,
ANTIBODY, IGM, FV




Compound 




IGM RF 2A2; CHAIN: A, C,
E; IGM RF 2A2; CHAIN: B,
D, F; IMMUNOGLOBULIN
G BINDING PROTEIN A;
CHAIN: G, H;


IMMUNOGLOBULIN 3D6
FAB 1DFB 3


IGMMEZ
IMMUNOGLOBULIN;
CHAIN: L; IGMMEZ
IMMUNOGLOBULIN;
CHAIN: H;


IMMUNOGLOBULIN FV
FRAGMENT OF A
HUMANIZED VERSION OF
THE ANTI-CD18 1FGV 3
ANTIBODY 'H52' (HUH52-
AAFV) 1FGV 4


IMMUNOGLOBULIN FV
FRAGMENT OF A
HUMANIZED VERSION OF
THE ANTI-CD18 1FGV 3
ANTIBODY 'H52' (HUH52-
AAFV) 1FGV 4


IMMUNOGLOBULIN
IMMUNOGLOBULIN M
(IG-M) FV FRAGMENT
1IGM 3


IMMUNOGLOBULIN
IMMUNOGLOBULIN M
(IG-M) FV FRAGMENT
1IGM 3


SEQ FOLD 
score 










55.38 




52.10 




PMF 
score 
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1.00 


0.99 




0.99 




1.00 


Verify 
score 
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0.43 
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PDB annotation 


TRANSFERASE
TRANSFERASE,
GLUTATHIONE,
CONJUGATION,
DETOXIFICATION, 2
CYTOSOLIC, DIMER


GENE REGULATION POZ
DOMAIN; PROTEIN-
PROTEIN INTERACTION
DOMAIN,
TRANSCRIPTIONAL 2
REPRESSOR, ZINC-FINGER
PROTEIN, X-RAY
CRYSTALLOGRAPHY, 3
PROTEIN STRUCTURE,
PROMYELOCYTIC
LEUKEMIA, GENE
REGULATION


COMPLEX (ZINC
FINGER/DNA) COMPLEX


Compound 


A 1GNE 3 CONSERVED
NEUTRALIZING EPITOPE
ON GP41 OF HUMAN 1GNE
4 IMMUNODEFICIENCY
VIRUS TYPE 1,
COMPLEXED WITH
GLUTATHIONE 1GNE 5


GLUTATHIONE
TRANSFERASE
GLUTATHIONE S-
TRANSFERASE
(E.C.2.5.1.18) (26 KDA)
1GTA 3


GLUTATHIONE S-
TRANSFERASE; CHAIN: A,
B, C, D;


PROMYELOCYTIC
LEUKEMIA ZINC FINGER
PROTEIN PLZF; CHAIN: A;


OXIDOREDUCTASE(OXYG
EN(A)) GALACTOSE
OXIDASE (E.C.1.1.3.9) (PH
4.5) 1GOF 3


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;


SEQ FOLD 
score 


















PMF 
score 




© 


© 

d 
i 




0.98 


0.01 




0.36 


Verify 
score 
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00 
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< 
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lgta 


il 

1-H 
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— : — 












COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA


PDB annotation 


(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 




DNA; CHAIN: A,B,D,E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D,E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 
















PMF 
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PDB annotation 


COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)








Compound 




YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


COMPLEX(TRANSCRIPTION
REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


SEQ FOLD 
score 




91.94 










PMF 
score 
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0.12 • 


1.00 


Verify 
score 
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PDB annotation 


ANTIGEN 1, MAJOR BLOOD-
STAGE EGF-LIKE DOMAIN,
EXTRACELLULAR,
MODULAR PROTEIN,
SURFACE 2 ANTIGEN,
MALARIA VACCINE
COMPONENT, SURFACE
PROTEIN 1


BLOOD COAGULATION,
SERINE PROTEASE,
COMPLEX, CO-FACTOR, 2
RECEPTOR ENZYME,


co 

6* 

W g 

1 
1 


c 

j 
< 

8 

! c 
✓ «»■» 


j AND) 




BLOOD CLOTTING FACTOR
VII, BLOOD COAGULATION,
EGF-LIKE DOMAIN, BLOOD
2 CLOTTING


BLOOD CLOTTING
COMPLEX(SERINE
PROTEASE/COFACTOR/LIG
AND), BLOOD
COAGULATION, 2 SERINE
PROTEASE, COMPLEX, CO-
FACTOR, RECEPTOR
ENZYME, 3 INHIBITOR,
GLA, EGF, COMPLEX
(SERINE 4
PROTEASE/COFACTOR/LIG
AND), BLOOD CLOTTING


PLASMINOGEN
ACTIVATION


Compound 




BLOOD COAGULATION
FACTOR VIIA; CHAIN: L,
H; SOLUBLE TISSUE
FACTOR; CHAIN: T, U; D-
PHE-PHE-ARG-
CHLOROMETHYLKETONE
(DFFRCMK) WITH CHAIN:
C;


GROWTH FACTOR
EPIDERMAL GROWTH
FACTOR (EGF) (NMR, 16
STRUCTURES) 1EGF 3


BLOOD COAGULATION
FACTOR VII; CHAIN: A;


BLOOD COAGULATION
FACTOR VIIA; CHAIN: L;
BLOOD COAGULATION
FACTOR VIIA; CHAIN: H;
SOLUBLE TISSUE
FACTOR; CHAIN: T; 5L15;


T-PLASMINOGEN
ACTIVATOR F1-G; 1TPG 7
CHAIN: NULL; 1TPG 8


SEQFOLD 
score 














PMF
score




0.96


0.40 


0.77 


0.64 


0.99 
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0.30 


1 0.18 
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0.45 . 
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PDB annotation 


MEMBRANE PROTEIN C-
TYPE LECTIN-LIKE
DOMAINS


MEMBRANE PROTEIN C-
TYPE LECTIN-LIKE
DOMAINS


HEMATOPOIETIC CELL
RECEPTOR ACTIVATION
INDUCER MOLECULE (AIM),
EA 1, HEMATOPOIETIC
CELL RECEPTOR,
LEUCOCYTE, C-TYPE
LECTIN-LIKE, 2 NKD, KLR


SUGAR BINDING PROTEIN
C-TYPE LECTIN, MANNOSE
RECEPTOR


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


Compound 


FLAVOCETIN-A: ALPHA
SUBUNIT; CHAIN: A;
FLAVOCETIN-A: BETA
SUBUNIT; CHAIN: B


FLAVOCETIN-A: ALPHA
SUBUNIT; CHAIN: A;
FLAVOCETIN-A: BETA
SUBUNIT; CHAIN: B


EARLY ACTIVATION
ANTIGEN CD69; CHAIN: A;


MACROPHAGE MANNOSE
RECEPTOR; CHAIN: A, B;


COAGULATION FACTORS


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


SEQ FOLD 
score 










59.78 




70.03 


PMF 
score 


0.81 


0.98 


1.00 


0.75 




1.00 




Verify 
score 


0.61 


0.36 


0.91 


0.71 




0.38 
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PDB annotation 


FACTOR/GROWTH FACTOR 
RECEPTOR 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, 
FGFR, IMMUNOGLOBULIN- 
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR 


3 

i 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, 
FGFR, IMMUNOGLOBULIN- 
LIKE, SIGNAL 
TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 
RECEPTOR 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


IMMUNE SYSTEM,
MEMBRANE PROTEIN CD32;
FC RECEPTOR,
IMMUNOGLOBULIN,
LEUKOCYTE, CD32


CONTRACTILE PROTEIN
IMMUNOGLOBULIN FOLD,
BETA BARREL 


MUSCLE PROTEIN 
CONNECTIN, NEXTM5;
CELL ADHESION,
GLYCOPROTEIN,
TRANSMEMBRANE,
REPEAT, BRAIN, 2


Compound 




FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: CD; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1 ; 
CHAIN: CD; 


FIBROBLAST GROWTH 
FACTOR 1; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: CD; 


FC RECEPTOR 
FC(GAMMA)RIIA; CHAIN:
A; 


TELOKIN; CHAIN: A 


TITIN; CHAIN: NULL; 


SEQ FOLD 
score 
















PMF 
score 




0.18 


0.39 


0.18


0.24 


0.64 


0.24 


Verify 
score 
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-0.11 


0.17 


0.27 
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PDB annotation 


BINDING, GPI-ANCHOR, 2 
NEURAL ADHESION 
MOLECULE. 


IMMUNOGLOBULIN FOLD, 
HOMOPHILIC 3 BINDING, 
CELL ADHESION PROTEIN . 




COMPLEX (ZINC
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN














Compound 








QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE
BINDING SITE; CHAIN
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 


SEQFOLD 
score 










70.01 








PMF 
score 






0.80 


0.99 




0.69 


0.96 


0.23 


Verify 
score 
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PDB annotation 


FACTOR RECEPTOR FGF, 
FGFR, IMMUNOGLOBULIN- 
LIKE, SIGNAL 
TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 
RECEPTOR


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2 
SUBGROUP WITHIN IG-LIKE 
DOMAINS, B-TREFOIL FOLD 


CONTRACTILE PROTEIN 
IMMUNOGLOBULIN FOLD,
BETA BARREL 


MUSCLE PROTEIN 
CONNECTIN, NEXTM5; 
CELL ADHESION, 
GLYCOPROTEIN, 
TRANSMEMBRANE, 
REPEAT, BRAIN, 2 
IMMUNOGLOBULIN FOLD,
ALTERNATIVE SPLICING,
SIGNAL, 3 MUSCLE
PROTEIN




MUSCLE PROTEIN
IMMUNOGLOBULIN
SUPERFAMILY, I SET,
MUSCLE PROTEIN






TWITCHIN 18TH IGSF
MODULE; CHAIN: NULL; 




Compound 


FACTOR 2; CHAIN: A, 
FIBROBLAST GROWTH
FACTOR RECEPTOR 
CHAIN: C,D; 




FIBROBLAST GROWTH
FACTOR 1; CHAIN: A
FIBROBLAST GROWTH
FACTOR RECEPTOR 
CHAIN: CD; 


TELOKIN; CHAIN: A 


TITIN; CHAIN: NULL; 




MUSCLE PROTEIN Tl 
MODULE M5 
(CONNECTIN) 1TNM
(NMR, MINIMIZED


AVERAGE STRUCTU
1TNM 4 1TNM 58


NERVE GROWTH 
FACTOR; CHAIN: V, 


SEQ FOLD 
score 
















PMF 
score 




00 
0 


0.95 


0.59 
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Verify 
score 
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PDB annotation 


SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION




COMPLEX 

(KINASE/INHIBITOR) CDK6; 
P19INK4D;CYCLIN 
DEPENDENT KINASE, 
CYCLIN DEPENDENT
KINASE INHIBITORY 2 
PROTEIN, CDK, INK4, CELL 
CYCLE, COMPLEX 
(KINASE/INHIBITOR) 
HEADER HELIX 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN, 
CYCLIN-DEPENDENT 
KINASE, CELL CYCLE 2 
CONTROL, ALPHA/BETA, 
COMPLEX (INHIBITOR
PROTEIN/KINASE)


PHOSPHOTRANSFERASE
PROTEIN KINASE 1CKI 18




OXYGEN TRANSPORT
OXYGEN TRANSPORT,
HEME, RESPIRATORY
PROTEIN, ERYTHROCYTE


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
HEME, RESPIRATORY
PROTEIN, ERYTHROCYTE


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
HEME, RESPIRATORY


Compound 


GAMMA; CHAIN: G; 




CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A, C; 
CYCLIN-DEPENDENT 
KINASE INHIBITOR; 
CHAIN: B,D;


CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B; 


CASEIN KINASE I DELTA; 
1CKI 6 CHAIN: A, B; 1CKI 7




HEMOGLOBIN; CHAIN: A, 
B 


HEMOGLOBIN; CHAIN: A, 
B 


HEMOGLOBIN; CHAIN: A, 
B 


SEQFOLD 
score 
















135.78 


102.38 


PMF 
score 
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0.24 


0.17 




1.00 
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cn 
9 


-0.01 


-0.33




0.65 






Psi 
Blast 






0.00099 

! 


0.0023 


0.0066 




2e-47 




oo 
cn 

6 
cn 
cn 












in 
i— i 




i-H 
Tf 
i-H 




rt 


START 
AA 






CO 
00 


cn 

00 


NO 

cn 






< 




CHAIN 
ID 






< 


< 


< 




< 


< 










00 
JO 


1blx


1cki




3 


1a4f
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PDB annotation 


HUMAN BLOOD, 2 OXYGEN
TRANSPORT 


OXYGEN TRANSPORT X- 
RAY STUDY, PORCINE 
HEMOGLOBIN, ARTIFICIAL 
HUMAN BLOOD, 2 OXYGEN
TRANSPORT




OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 
PROTEIN, ERYTHROCYTE 


OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 
PROTEIN. ERYTHROCYTE 


OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 
PROTEIN, ERYTHROCYTE 


OXYGEN TRANSPORT 
HEME PROTEIN, MODEL 
COMPOUNDS, OXYGEN 
STORAGE, LIGAND 2
BINDING GEOMETRY,
CONFORMATIONAL
SUBSTATES, OXYGEN 3
TRANSPORT


OXYGEN TRANSPORT
OXYGEN TRANSPORT




Compound 


HEMOGLOBIN (BETA 
SUBUNIT); CHAIN: B, D


PORCINE HEMOGLOBIN
(ALPHA SUBUNIT);
CHAIN: A, C; PORCINE
HEMOGLOBIN (BETA
SUBUNIT); CHAIN: B,D




HEMOGLOBIN; CHAIN: A, 
B 


HEMOGLOBIN; CHAIN: A, 
B 


HEMOGLOBIN; CHAIN: A, 
B 


MYOGLOBIN; CHAIN: 
NULL; 


HEMOGLOBIN; CHAIN: A, 
E.C.F; 


OXYGEN TRANSPORT 
HEMOGLOBIN 
THIONVILLE ALPHA 


UxlAJJN mU 1 /uN I Vriin 

VAL 1 1BAB 3 REPLACED
BY GLU AND AN
ACETYLATED MET
BOUND TO THE 1BAB 4
AMINO TERMINUS 1BAB 5


SEQ FOLD 
score 




133.84 






106.59 


87.72 




93.82 
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0.28 
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PDB annotation 










OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT


OXYGEN 

STORAGE/TRANSPORT HB 
D; HB D HEMOGLOBIN D (R- 
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT




Compound 


OXYGEN TRANSPORT 
HEMOGLOBIN flDEOXY. 


HUMAN FETAL F=/H$=) 
IFDHGl 1FDHH2 


OXYGEN CARRIER 
HEMOGLOBIN (DEOXY) 
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


HEMOGLOBIN D; CHAIN:


A, C; HEMOGLOBIN D; 
CHAIN: B,D; 


HEMOGLOBIN D; CHAIN: 
A, C; HEMOGLOBIN D; 
CHAIN: B,D; 


HEMOGLOBIN D; CHAIN: 
A, C; HEMOGLOBIN D; 
CHAIN: B, D;




OXYGEN TRANSPORT 
HEMOGLOBIN (DEOXY) 
1HDA 3


OXYGEN TRANSPORT 
HEMOGLOBIN (DEOXY) 


SEQFOLD 
score 


100.14




101.11 


76.75 




117.48 


81.42 




105.99 
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1.00 




Verify 
score 




0.16






0.75 






0.49 




Psi 
Blast 


cn 
cn 

o 

cn 


Os 
cn 

A 
cn 

CO 


OS 

cn 
6 

CO 
CO 


VO 
CO 

SO 
NO 


OS 
CO 

M 


3 


3.3t>32 


o 

CO 


i 

CO 




Os 

♦ 


On 


Os 


OS 

r- 


OS 


Os 


VO 


ON 

r* 


Pi 


START 
AA 


Os 
cn 


cs 


OS 
CO 


00 
CO 




CO 


CO 




o 


CHAIN 
ID 


o 


< 


< 


CQ 


< 


< 


pa 




< 


QQ - 

ge 


1fdh


1hbh


1hbh


1hbh


1hbr


1hbr


1hbr


1hda


1hda


SEQ ID 
NO: 


cn 


3 


CO 


3 


m 
cn 


CO 


in 

3 


m 
co 





461 



WO 02/081731 



PCT/US02/01222







PDB annotation 


BINDING PROTEIN




HYDROLASE ERA, GTPASE,


RNA-BINDING, RAS-LIKE, 
HYDROLASE 




TRANSLATION EF-TU; 
GTPASE, MOLECULAR 
SWITCH, TRNA, RIBOSOME, 
Q-BETA REPLICASE, 2 
CHAPERONE, DISULFIDE 
ISOMERASE 


TRANSLATION EF-G; BENT 
CONFORMATION, VISIBLE 
DOMAIN III, MUTATION
HIS573ALA 


TRANSLATION 
TRANSLATIONAL GTPASE


PROTEIN BINDING EF-G; EF-
G ELONGATION FACTOR,
TRANSLOCASE, RIBOSOME
ELONGATION, 2
TRANSLATION, PROTEIN
SYNT FACTOR, GTPASE,
GTP BINDING, 3
GUANOSINE NUCLEOTIDE


u: 

I 

|| 




COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC


Compound 






GTP-BINDING PROTEIN


ERA; CHAIN: A, B;


TRANSPORT AND 
PROTECTION PROTEIN 
ELONGATION FACTOR TU 
(DOMAIN I) - 
GUANOSINE
DIPHOSPHATE 1ETU 4
COMPLEX 1ETU 5


ELONGATION FACTOR TU 
(EF-TU); CHAIN: A; 


ELONGATION FACTOR G; 
CHAIN: A; 




TRANSLATION 
INITIATION FACTOR 
IF2/EIF5B; CHAIN: A; 


ELONGATION FACTOR G; 
CHAIN: A; ELONGATION 
FACTOR G DOMAIN 3; 

CHAIN: B;






B g 
o c 

N 2 


: 

il 


SEQFOLD 
score 






















PMF 
score 
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-0.14 
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score 






0.12 


0.21 
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COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA 
INTERACTION; PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 










PDB annotation 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 


Compound 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQ FOLD
score
















sr 

CO 
















PMF 
score 
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00*1 
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score 
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PDB annotation 


STRUCTURE, COMPLEX
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 

INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,


Compound 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER


SEQFOLD 
score 




107.60 












PMF 
score 
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1.00 


Verify 
score 
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PDB annotation 


PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX




PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


Compound 
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PDB annotation 


SIGNALING PROTEIN GTP- 
BINDING PROTEINS, 
PROTEIN-PROTEIN 
COMPLEX, EFFECTORS 


SIGNALING PROTEIN GTP-
BINDING PROTEINS,
PROTEIN-PROTEIN 
COMPLEX, EFFECTORS 


SIGNALING PROTEIN G 
PROTEIN, GTP 
HYDROLYSIS, KINETIC 
CRYSTALLOGRAPHY, 2 
SIGNALING PROTEIN 


SIGNALING PROTEIN G 
PROTEIN, GTP 
HYDROLYSIS, KINETIC 
CRYSTALLOGRAPHY, 2 
SIGNALING PROTEIN 


SIGNALING PROTEIN 
PROTEIN-PROTEIN
COMPLEX, ANTIPARALLEL
COILED-COIL


ENDOCYTOSIS/EXOCYTOSIS
G-PROTEIN, GTPASE,
RAB6, VESICULAR
TRAFFICKING


TRANSLATIONAL GTPASE
EF-G RIBOSOMAL
TRANSLOCASE,
TRANSLATIONAL GTPASE


G PROTEIN G PROTEIN,
RAS, ARF, ARF6,
MEMBRANE TRAFFIC


Pi 

go 


Compound 


RAS-RELATED PROTEIN 
RAP-1A; CHAIN: A;
PROTO-ONKOGENE 
SERINE/THREONINE 
PROTEIN KINASE CHAIN: 
B; 


RAS-RELATED PROTEIN 
RAP-1A; CHAIN: A;
PROTO-ONKOGENE 
SERINE/THREONINE 
PROTEIN KINASE CHAIN: 
B; 


TRANSFORMING PROTEIN
P21/H-RAS-1; CHAIN: A; 


TRANSFORMING PROTEIN 
P21/H-RAS-1; CHAIN: A; 




HIS-TAGGED 

TRANSFORMING PROTEIN 
RHOA(0-181);CHAIN:A; 
PKN; CHAIN: B; 


< 
< 

i 

VC 


r 

I 
1 
1 

! 
> 


ELONGATION FACTOR G; 




ADP-RIBOSYLATION 
FACTOR 6; CHAIN: A; 


GTP-BINDING PROTEIN 
YPT51; CHAIN: A; 


SEQFOLD 
score 


58.13 
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score 
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0.59 
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d 

s 

5 
e 

§ 

e 


(GTPASE 

ACTIVATION/PROTO-
ONCOGENE), GTPASE, 2 
TRANSITION STATE, GAP 


COMPLEX (GTP- 
BINDING/EFFECTOR) RAS- 
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


PROTEIN BINDING EF-G; EF-
G ELONGATION FACTOR,
TRANSLOCASE, RIBOSOME
ELONGATION, 2
TRANSLATION, PROTEIN
SYNT FACTOR, GTPASE,
GTP BINDING, 3
GUANOSINE NUCLEOTIDE
BINDING, PROTEIN
BINDING


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,
NEUROTRANSMITTER






■T 




ELONGATION FACTOR G; 
CHAIN: A; ELONGATION 
FACTOR G DOMAIN 3; 
CHAIN: B; 




Compound 




§<: 
s *? 


RAB-3A; CHAIN: A; 
RABPHILIN-3A; CHAIN


RAB3A; CHAIN: A; 
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PDB annotation 


PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




SERINE ESTERASE 
HYDROLASE, SERINE
ESTERASE, GLYCOPROTEIN




RNA-BINDING 
PROTEIN/RNA TRA PRE-
MRNA; SPLICING 
REGULATION, RNP 
DOMAIN, RNA COMPLEX 


GENE REGULATION/RNA 
POLY(A) BINDING PROTEIN 
1, PABP 1; RRM, PROTEIN-
RNA COMPLEX, GENE
REGULATION/RNA


RNA BINDING PROTEIN
RNA-BINDING DOMAIN


STRUCTURAL PROTEIN
PROTEIN C23; RNP, RBD,
RRM, RNA BINDING
DOMAIN, NUCLEOLUS


STRUCTURAL PROTEIN
PROTEIN C23; RNP, RBD,
RRM, RNA BINDING
DOMAIN, NUCLEOLUS


Si 


Compound 


GLI1; CHAIN: A; DNA; 
CHAIN: C,D; 


ZINC FINGER PROTEIN 
GLI1; CHAIN: A; DNA;
CHAIN: C, D; 




CUTINASE; CHAIN: NULL; 




ml 

ml 


POLYADENYLATE BINDING
PROTEIN 1; CHAIN: A, B,
C, D, E, F, G, H; RNA (5'-
R(*AP*AP*AP*AP*AP*AP*
AP*AP*AP*AP*A)-3');
CHAIN: M, N, O, P, Q, R, S,
T; 


HU ANTIGEN C; CHAIN: A; 


NUCLEOLIN RBD1; 
CHAIN: A; 


NUCLEOLIN RBD2; 
CHAIN: A; 


HNRNP A1; CHAIN: NULL;




SEQFOLD 
score 
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0.92 

l 
i 
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0.35 


Psi
Blast 




i 

co 




0.002 




CO 
»-H 

& 


CO 


Tt 
r-i 

vd 


CO 

6 

CO 

•-3 


3 

CO 


OO 

f-H 

f-4 






m 
»— < 
co 




On 
CO 
SO 




cs 

r-4 




CO 

*-H 


CO 

r- 

f-H 


CO 

r- 


o 

f-H . 


START 
AA 




CO 




§ 




OO 

o 


on 


o 

ft 


8 


o 
o 


8 

f— • 


CHAIN 
ID 




< 








< 


< 


< 




< 




la 




2gli 




1cex




.o 


1cvj


M 
OO 
T3 

f»H 




o 

f-4 


1ha1


ID 






























CO 




CO 




NO 

r* 

CO 


VO 

r» 

CO 


VO 

& 


VO 
CO 


VO 

r- 
co 


VO 
CO 








PDB annotation 


RNA BINDING PROTEIN 
RNA-BINDING DOMAIN 


STRUCTURAL PROTEIN 
PROTEIN C23; RNP, RBD, 
RRM, RNA BINDING 
DOMAIN, NUCLEOLUS


NUCLEAR PROTEIN


HETEROGENEOUS
NUCLEAR

RIBONUCLEOPROTEIN A1,
NUCLEAR PROTEIN,
HNRNP, RBD, RRM, RNP,
RNA BINDING, 2
RIBONUCLEOPROTEIN


RNA BINDING PROTEIN
RNA-BINDING DOMAIN 


I 




RNA-BINDING DOMAIN
RNA-BINDING DOMAIN,
ALTERNATIVE SPLICING


Compound 


HU ANTIGEN C; CHAIN: A; 


NUCLEOLIN RBD1;
CHAIN: A; 


HNRNP A1; CHAIN: NULL;




HETEROGENEOUS 


RIBONUCLEOPROTEIN D0;
CHAIN: A; 


RNA-BINDING PROTEIN 
SEX-LETHAL PROTEIN (C-
TERMINUS, OR SECOND
RNA-BINDING DOMAIN
1SXL 3 (RBD-2), RESIDUES
199 - 294 PLUS N-
TERMINAL MET) 1SXL 4
(NMR, 17 STRUCTURES)
1SXL 5


RNA-BINDING PROTEIN
SEX-LETHAL PROTEIN (C-
TERMINUS, OR SECOND
RNA-BINDING DOMAIN
1SXL 3 (RBD-2), RESIDUES
199 - 294 PLUS N-
TERMINAL MET) 1SXL 4
(NMR, 17 STRUCTURES)
1SXL 5


SEX-LETHAL PROTEIN; 
CHAIN: NULL; 


SEQFOLD 
score 












51.72 


57.11 


PMF 
score 
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0.99 
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PDB annotation 


(PROTEASE/INHIBITOR) 
COMPLEX 

(PROTEASE/INHIBITOR), 
TISSUE KALLIKREIN, 
SERINE 2 PROTEASE, 
TRYPSIN, PSA, KININ, 
SERPIN 


GLYCOPROTEIN
GLYCOPROTEIN


GLYCOPROTEIN 
GLYCOPROTEIN 


GLYCOPROTEIN 
GLYCOPROTEIN 


SERINE PROTEASE 
INHIBITOR FACTOR XA 
INHIBITOR; ANTISTASIN, 
CRYSTAL STRUCTURE, 
FACTOR XA INHIBITOR, 2 
SERINE PROTEASE 


INHIBITOR, THROMBOSIS 


I It- 




3 


COMPLEX (GTPASE-
ACTIVATING/GTP-BINDING)
COMPLEX (GTPASE-
ACTIVATING/GTP-
BINDING), GTPASE
ACTIVATION


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


Compound 


X, Y; HIRUSTASIN; CHAIN:
I, J;






LAMININ; CHAIN: NULL;


LAMININ; CHAIN: NULL; 


ANTISTASIN; CHAIN: 




LECTIN (AGGLUTININ) 
WHEAT GERM 
AGGLUTININ (ISOLECTIN 
2) 9WGA 3


LECTIN (AGGLUTININ) 
WHEAT GERM 
AGGLUTININ (ISOLECTIN 
2) 9WGA 3




P50-RHOGAP; CHAIN: A, B, 
C; CDC42HS; CHAIN: D, E, 
F; 


GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


SEQ FOLD 
score 








73.03




60.96 






66.17 


97.94 


PMF
score 




-0.02 


0.11 




0.18 




-0.19 








Verify 
score 




0.40 


0.10 
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PDB annotation 


TRANSPORT PROTEIN TC4; 
GTPASE, NUCLEAR 
TRANSPORT, TRANSPORT 
PROTEIN


TRANSPORT PROTEIN TC4; 
GTPASE, NUCLEAR 
TRANSPORT, TRANSPORT 
PROTEIN 


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR 
TRANSPORT, TRANSPORT 
PROTEIN 


SIGNALING PROTEIN GTP- 
BINDING PROTEINS, 
PROTEIN-PROTEIN 
COMPLEX, EFFECTORS 


SIGNALING PROTEIN GTP- 
BINDING PROTEINS, 
PROTEIN-PROTEIN 
COMPLEX, EFFECTORS 


SIGNALING PROTEIN G
PROTEIN, GTP
HYDROLYSIS, KINETIC
CRYSTALLOGRAPHY, 2
SIGNALING PROTEIN


SIGNALING PROTEIN G
PROTEIN, GTP
HYDROLYSIS, KINETIC
CRYSTALLOGRAPHY, 2
SIGNALING PROTEIN


SIGNALING PROTEIN
PROTEIN-PROTEIN


it 

J 

is 

5 o 
) u 


ENDOCYTOSIS/EXOCYTOSIS


Compound 


GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


RAS-RELATED PROTEIN
RAP-1A; CHAIN: A;
PROTO-ONKOGENE
SERINE/THREONINE
PROTEIN KINASE CHAIN: 
B; 


RAS-RELATED PROTEIN 
RAP-1A; CHAIN: A;
PROTO-ONKOGENE 
SERINE/THREONINE 
PROTEIN KINASE CHAIN: 
B; 


TRANSFORMING PROTEIN 
P21/H-RAS-1; CHAIN: A; 


TRANSFORMING PROTEIN 
P21/H-RAS-1; CHAIN: A; 


HIS-TAGGED 

TRANSFORMING PROTEIN 
RHOA(0-181); CHAIN: A; 
PKN; CHAIN: B; 


RAB6 GTPASE; CHAIN: A; 


SEQFOLD 
score 




93.32 




101.45 






113.07 


87.72 




PMF 
score 


1.00 




0.98 




1.00 


1.00 
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PDB annotation 


COMPLEX 

(INHIBITOR/NUCLEASE)
COMPLEX

(INHIBITOR/NUCLEASE),
COMPLEX (RI-ANG),
HYDROLASE 2 MOLECULAR
RECOGNITION, EPITOPE
MAPPING, LEUCINE-RICH 3
REPEATS


COMPLEX 

(INHIBITOR/NUCLEASE)
COMPLEX

(INHIBITOR/NUCLEASE),
COMPLEX (RI-ANG), 
HYDROLASE 2 MOLECULAR 
RECOGNITION, EPITOPE 
MAPPING, LEUCINE-RICH 3 
REPEATS 


COMPLEX 

(INHIBITOR/NUCLEASE)
COMPLEX

(INHIBITOR/NUCLEASE),
COMPLEX (RI-ANG),
HYDROLASE 2 MOLECULAR
RECOGNITION, EPITOPE
MAPPING, LEUCINE-RICH 3
REPEATS


COMPLEX (NUCLEAR
PROTEIN/RNA) COMPLEX
(NUCLEAR PROTEIN/RNA),
RNA,
SNRNP, RIBONUCLEOPROTEIN


HUH D 
3 < # g 

8 £ ^§ 1 § 


Compound 


RIBONUCLEASE 
INHIBITOR; CHAIN: A, D; 
ANGIOGENIN; CHAIN: B, 
E; 


RIBONUCLEASE 


S « 

si 

it 

5 < « 


RIBONUCLEASE
INHIBITOR; CHAIN: A, D;
ANGIOGENIN; CHAIN: B,
E;


U2 RNA HAIRPIN IV; 
CHAIN: Q, R; U2A';
CHAIN: A, C; U2B";
CHAIN: B, D;


U2 RNA HAIRPIN IV;
CHAIN: Q, R; U2A';
CHAIN: A, C; U2B";
CHAIN: B, D;


SEQFOLD 
score 












PMF 
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PDB annotation 


YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


TRANSCRIPTION
REGULATION
TRANSCRIPTION
REGULATION, ADR1, ZINC
FINGER, NMR


Compound 


INITIATOR ELEMENT


DNA; CHAIN: A, B; 


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


g 


1 


SEQFOLD 
score 
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Verify 
score 




0.28 


0.33 


0.37 


0.30 


Psi 
Blast 




i 


? 

r- 1 


cn 


*-* 
«n 


M 




1—1 

OO 

t> 






cn 
cn 
oo 


OO 


START 
AA 




m 


00 

s 


© 
cn 


m 


CHAIN 
ID 




u 


U 


U 








1ubd


1ubd


1ubd


■s- 

CN 


SEQID 
NO: 




8 
m 


cn 
o> 
cn 


cn 

ON 

cn 


cn 

ON 

cn 






PDB annotation 


TRANSCRIPTION
REGULATION
TRANSCRIPTION
REGULATION, ADR1, ZINC
FINGER, NMR 




COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA) 




GENE REGULATION POZ
DOMAIN; PROTEIN-
PROTEIN INTERACTION
DOMAIN,
TRANSCRIPTIONAL 2
REPRESSOR, ZINC-FINGER
PROTEIN, X-RAY
CRYSTALLOGRAPHY, 3
PROTEIN STRUCTURE,
PROMYELOCYTIC
LEUKEMIA, GENE
REGULATION


Compound 


ADR1; CHAIN: NULL;


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER


PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




ZINC FINGER PROTEIN 
GLI1; CHAIN: A; DNA;
CHAIN: C, D; 


ZINC FINGER PROTEIN 
GLI1; CHAIN: A; DNA;
CHAIN: C,D; 




PROMYELOCYTIC 
LEUKEMIA ZINC FINGER 
PROTEIN PLZF; CHAIN: A; 
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PDB annotation 


DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


8 


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


w 

Q 
PQ 

<d 

I 


SEQFOLD 
score 














98.26 


PMF 
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P 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)






PDB annotation 


DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC




Compound 


■ 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C,F,G; 


W 
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SEQ FOLD 
score 














95.94 
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PMF 
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PDB annotation 


TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 


YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA) 


Compound 




YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5


INITIATOR ELEMENT
DNA; CHAIN: A, B;




YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


SEQFOLD 
score 
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PDB annotation 


FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 




CALCIUM-BINDING 
PROTEIN 2A9, CACY,
S100A6, PRA; CALCIUM-
BINDING PROTEIN, EF-
HAND, S-100 PROTEIN, NMR


CALCIUM/PHOSPHOLIPID
BINDING PROTEIN P11,
CALPACTIN LIGHT CHAIN;
S100 FAMILY, EF-HAND
PROTEIN, LIGAND OF
ANNEXIN II, 2
CALCIUM/PHOSPHOLIPID
BINDING PROTEIN


METAL BINDING PROTEIN
S100B, S100BETA;
S100BETA, S100B, NMR,
DIPOLAR COUPLINGS, EF-
HAND, S100 2 PROTEIN,
CALCIUM-BINDING
PROTEIN, FOUR-HELIX
BUNDLE, THREE- 3
DIMENSIONAL STRUCTURE,
SOLUTION STRUCTURE


CALCIUM-BINDING
PROTEIN SNTNC; CALCIUM-
BINDING, REGULATION,


Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;






S-100 PROTEIN, BETA 
CHAIN; CHAIN: A, B; 


N-TROPONIN C; CHAIN:
NULL;


SEQ FOLD 
score 










70.10 


87.95 


84.29 










0.90 
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Verify 
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PDB annotation 


TROPONIN C, SKELETAL 
MUSCLE, 2 CONTRACTION 


METAL TRANSPORT 
CALMODULIN, HIGH
RESOLUTION, DISORDER 


CALCIUM-BINDING 
CALCIUM-BINDING, ZINC, 
METAL-BINDING, 
ACETYLATION 








RECEPTOR RECEPTOR,
SIGNAL TRANSDUCER OF
IL-6 TYPE CYTOKINES,
THIRD 2 N-TERMINAL
DOMAIN,
TRANSMEMBRANE,
GLYCOPROTEIN


RECEPTOR RECEPTOR,
SIGNAL TRANSDUCER OF
IL-6 TYPE CYTOKINES,
THIRD 2 N-TERMINAL
DOMAIN,
TRANSMEMBRANE,
GLYCOPROTEIN


HORMONE/GROWTH 
FACTOR HORMONE, 
RECEPTOR, 
HORMONE/GROWTH 
FACTOR 


CONNECTIN A71, 
CONNECTIN; TITIN. 


Compound 




CALMODULIN; CHAIN: A; 


S-100 PROTEIN; CHAIN: 
NULL; 


CONTRACTILE SYSTEM
PROTEIN TROPONIN C
1TOP 3


MUSCLE PROTEIN
TROPONIN C (TR1C
FRAGMENT) (APO FORM)
(NMR, 1 STRUCTURE)
1TRF 3




GP130; CHAIN: NULL;


GP130; CHAIN: NULL; 


GROWTH HORMONE; 
CHAIN: A; PROLACTIN 
RECEPTOR; CHAIN: B; 


TITIN; CHAIN: NULL; 


SEQFOLD 
score 






83.93 








51.99 




57.27 




PMF 
score 
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PDB annotation 


(CYTOKINE/RECEPTOR)
EPOBP; ERYTHROPOIETIN,


ERYTHROPOIETIN
RECEPTOR, SIGNAL 2
TRANSDUCTION,
HEMATOPOIETIC
CYTOKINE, CYTOKINE
RECEPTOR 3 CLASS 1,
COMPLEX




CELL ADHESION PROTEIN
RGD, EXTRACELLULAR
MATRIX 1FNF 18


HEPARIN AND INTEGRIN
BINDING HEPARIN AND
INTEGRIN BINDING


HEPARIN AND INTEGRIN
BINDING HEPARIN AND
INTEGRIN BINDING


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN
STRUCTURAL PROTEIN


Compound 


CHAIN: A; 
ERYTHROPOIETIN 


RECEPTOR; CHAIN: B, C; 


i 

1 


FIBRONECTIN CELL-
ADHESION MODULE TYPE
III-10 1FNA 3


FIBRONECTIN; 1FNF 6
CHAIN: NULL; 1FNF 7


FIBRONECTIN; CHAIN: A;


FIBRONECTIN; CHAIN: A;


FIBRONECTIN; CHAIN:
NULL;


FIBRONECTIN; CHAIN:
NULL;


INTEGRIN BETA-4


SEQFOLD 
score 






92.46 


81.90 




60.33 




66.57 


PMF
score




0.69 






0.19 




0.07 


0.64 
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PDB annotation 


STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;




















SEQFOLD
score 


















PMF 
score 
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Verify 
score 
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0.21 


Psi 
Blast 




CO 
CO 


6 

00 
VO 


i 

r-t 

in 


VO 
CO 




i ^ 




o\ 

J—i 






CO 
OA 
»— I 


o 

s 


START 
AA 




00 
VO 


CO 




CO 


m 


CHAIN 
ID 




0 


o 


u 


U 


U 


PQ 

gQ 




1mey


! 


1ubd


1ubd


1ubd


SEQ ID
NO:




















5 






*<* 












WO 02/081731 



PCT/US02/01222 



PDB annotation 


FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 








LIGASE CBL, UBCH7, ZAP-
70, E2, UBIQUITIN, E3,
PHOSPHORYLATION, 2
TYROSINE KINASE,
UBIQUITINATION, PROTEIN
DEGRADATION,


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


KINASE KINASE, SIGNAL
TRANSDUCTION,
CALCIUM/CALMODULIN


Compound 




COMPLEX(TRANSCRIPTIO
N REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4




VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


SIGNAL TRANSDUCTION
PROTEIN CBL; CHAIN: A;
ZAP-70 PEPTIDE; CHAIN:
B; UBIQUITIN-
CONJUGATING ENZYME
E2-18 KDA UBCH7;
CHAIN: C;


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;




CALCIUM/CALMODULIN- 
DEPENDENT PROTEIN 
KINASE; CHAIN: NULL; 


SEQFOLD 
score 


















110.80 


PMF 
score 




0.03 




0.66 


0.48 


0.40 


0.05 






Verify 
score 




0.48 
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PDB annotation 


HETNAM


SERINE PROTEASE SERINE
PROTEASE HEADER
HETNAM


SERINE PROTEASE SERINE
PROTEASE, HYDROLASE,
COMPLEMENT, FACTOR D,
CATALYTIC 2 TRIAD, SELF-
REGULATION


BLOOD CLOTTING TSV-PA;
FIBRINOLYSIS,
PLASMINOGEN
ACTIVATOR, SERINE
PROTEINASE, 2 SNAKE
VENOM, COMPLEX
(HYDROLASE/INHIBITOR),
BLOOD CLOTTING


BLOOD CLOTTING TSV-PA;
FIBRINOLYSIS,
PLASMINOGEN
ACTIVATOR, SERINE
PROTEINASE, 2 SNAKE
VENOM, COMPLEX
(HYDROLASE/INHIBITOR),
BLOOD CLOTTING


INHIBITOR, SPECIFICITY,
SERINE PROTEASE, 2
COMPLEX (SERINE
PROTEASE/INHIBITOR)


SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE, DIGESTION,
PANCREAS, ZYMOGEN, 2
SIGNAL, MULTIGENE
FAMILY


Compound 




ALPHA THROMBIN; 
CHAIN: A,B,F,E; 


COMPLEMENT FACTOR D; 
CHAIN: NULL; 


PLASMINOGEN 
ACTIVATOR; CHAIN: A, B; 
GLU-GLY-ARG- 
CHLOROMETHYLKETONE 
INHIBITOR; CHAIN: E, F; 


PLASMINOGEN
ACTIVATOR; CHAIN: A, B;
GLU-GLY-ARG-
CHLOROMETHYLKETONE
INHIBITOR; CHAIN: E, F;


CATHEPSIN G; CHAIN: A;
PHOSPHONATE
INHIBITOR SUC-VAL-PRO-
PHEP-(OPH)2; CHAIN: S;


TRYPSIN; CHAIN: NULL;


ENTEROPEPTIDASE;


SEQ FOLD 
score 






133.21 


154.29 




126.65 


165.05 


136.64


PMF 
score 




m 

"0 
o 






OO'l 








Verify 
score 




-0.62 






0.69 
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« 


» - 
o ft 




1bhx


1bio


1bqy


1cgh


1dpo


1ekb
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PDB annotation 


COMPLEX (SERINE
PROTEASE/COAGULATION)
COMPLEX (SERINE
PROTEASE/COAGULATION),
SERINE, PROTEASE, 2
THROMBIN


COMPLEX (SERINE
PROTEASE/COAGULATION)
COMPLEX (SERINE
PROTEASE/COAGULATION),
SERINE, PROTEASE, 2
THROMBIN


COMPLEX (SERINE
PROTEASE/PEPTIDE)
FIBRINOPEPTIDE-A,
COMPLEX (SERINE
PROTEASE/PEPTIDE), 2
THROMBIN


Compound 


HYDROLASE (SERINE
PROTEINASE) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH THE INHIBITOR
1TRN 3 DIISOPROPYL-
FLUOROPHOSPHOFLUORI
DATE (DFP) 1TRN 4
HUMAN TRYPSIN, DFP


HYDROLASE (SERINE
PROTEINASE) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH THE INHIBITOR
1TRN 3 DIISOPROPYL-
FLUOROPHOSPHOFLUORI
DATE (DFP) 1TRN 4
HUMAN TRYPSIN, DFP


THROMBIN; CHAIN: L, H,
E, J, K, M, N;
FIBRINOPEPTIDE A-
ALPHA; CHAIN: F, G, I;


THROMBIN; CHAIN: L, H,
E, J, K, M, N;
FIBRINOPEPTIDE A-
ALPHA; CHAIN: F, G, I;


ALPHA THROMBIN;
CHAIN: L, H; EPSILON
THROMBIN; CHAIN: J, K,
M; FIBRINOPEPTIDE A-
ALPHA; CHAIN: F, N;


SERINE PROTEASE
GAMMA-THROMBIN 2HNT
3


SEQ FOLD 
score 


177.94 












PMF 
score 




1.00 


0.58 


0.98 


0.59 


0.05 


Verify 
score 




0.69 


9 


0.11 


-0.47 


-0.39 


Psi 
Blast 


1.7e- 
100 


1.7e- 
100 


% 

i-H 


3.4e-34 


5.1e-32


oo 
vb 






i-H 


oo 

S3 
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oo 
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START 
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00 
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l-H 
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§e 

PL, 




l-H 


1ucy


1ucy


1ycp


2hnt 


SEQ ID 
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PDB annotation 












SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE, DIGESTION,
PANCREAS, 2 ZYMOGEN,
SIGNAL


SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE, DIGESTION,
PANCREAS, 2 ZYMOGEN,
SIGNAL


IMMUNE SYSTEM
CATALYTIC ANTIBODY,


Compound 


SERINE PROTEINASE
KALLIKREIN A
(E.C.3.4.21.8) 2PKA 4


SERINE PROTEINASE
KALLIKREIN A
(E.C.3.4.21.8) 2PKA 4


HYDROLASE(SERINE
PROTEINASE) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH BENZAMIDINE
INHIBITOR 2TBS 3


HYDROLASE(SERINE
PROTEINASE) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH BENZAMIDINE
INHIBITOR 2TBS 3


SERINE PROTEINASE RAT 
MAST CELL PROTEASE 
/H$ r/RMCPII$) 3RP24 


BETA TRYPSIN; CHAIN:
NULL;


BETA TRYPSIN; CHAIN:
NULL;


COMPLEX
(ANTIBODY/ANTIGEN)
HYHEL-5 FAB
COMPLEXED WITH
BOBWHITE QUAIL
LYSOZYME 1BQL 3 1BQL
95


CATALYTIC ANTIBODY 
1E9 (LIGHT CHAIN); 


SEQFOLD 
score 






168.64 




116.86 


170.27 










PMF 
score 
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Verify 
score 
DIELS-ALDER,
IMMUNOGLOBULIN 


IMMUNOGLOBULIN 
IMMUNOGLOBULIN, FAB 
COMPLEX, IDIOTOPE, ANTI- 
IDIOTOPE 


IMMUNOGLOBULIN, FAB 
COMPLEX, IDIOTOPE, ANTI- 
IDIOTOPE 


WILLEBRAND FACTOR, 
GLYCOPROTEIN IBA 
(A: ALPHA) BINDING, 2 
COMPLEX




Compound 


CHAIN: L; CATALYTIC
ANTIBODY 1E9 (HEAVY
CHAIN); CHAIN: H;


IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: L;
IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: H; VON
WILLEBRAND FACTOR;


CHAIN: A;


IMMUNOGLOBULIN/VIRU
S HEMAGGLUTININ
IGG2A FAB FRAGMENT
(FAB 26/9) COMPLEXED
WITH INFLUENZA 1FRG 3
HEMAGGLUTININ HA1
(STRAIN X47) (RESIDUES
101-108) 1FRG 4


(IMMUNOGLOBULIN FAB
FRAGMENT OF


IG HEAVY CHAIN V
REGIONS; CHAIN: A;
HEAVY CHAIN V
REGIONS; CHAIN: B;
HEAVY CHAIN V
REGIONS; CHAIN: C;
HEAVY CHAIN V
REGIONS; CHAIN: D;


IG HEAVY CHAIN V
REGIONS; CHAIN: A;
HEAVY CHAIN V
REGIONS; CHAIN: B;
HEAVY CHAIN V
REGIONS; CHAIN: C;
HEAVY CHAIN V
REGIONS; CHAIN: D
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PDB annotation 


FOLD, GLYCOPROTEIN


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


Compound 


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;
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PDB annotation 


PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN


Compound 




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PDB annotation 


DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)
COMPLEX (ZINC


Compound 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 


SEQ FOLD 
score 
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PDB annotation 


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)




Compound 


CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


TRANSCRIPTION FACTOR 
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F; 


J FOLD 
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PDB annotation 

REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)




Compound 


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


ADR1; CHAIN: NULL;


ADR1; CHAIN: NULL;
COMPLEX(TRANSCRIPTIO


SEQFOLD 
score 
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PDB annotation 






( 

i 


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


SIGNALING PROTEIN
PHOTORECEPTOR, G
PROTEIN-COUPLED
RECEPTOR, MEMBRANE
PROTEIN, 2 RETINAL
PROTEIN, VISUAL PIGMENT




Compound 


N REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4


COMPLEX(TRANSCRIPTIO
N REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4


COMPLEX(TRANSCRIPTIO
N REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




RHODOPSIN; CHAIN: A, B 


GROWTH FACTOR ACIDIC 
FIBROBLAST GROWTH 
FACTOR (AFGF) MUTANT 
WITH CYS 47 1AFC 3


SEQFOLD 
score 
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*n 
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in 
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c 
c 

c 

i 


\ 
\ 

i 

3 / 
if 














GROWTH FACTOR FGF-2;


Compound 


REPLACED BY ALA (C47A)
COMPLEX WITH SUCROSE
OCTASULFATE 1AFC 4


GROWTH FACTOR ACIDIC
FIBROBLAST GROWTH
FACTOR (AFGF) MUTANT
WITH CYS 47 1AFC 3
REPLACED BY ALA (C47A)
COMPLEX WITH SUCROSE
OCTASULFATE 1AFC 4


GROWTH FACTOR ACIDIC
FIBROBLAST GROWTH
FACTOR (AFGF) MUTANT
WITH CYS 47 1BAR 3
REPLACED BY ALA AND
HIS 93 REPLACED BY GLY
(C47A,H93G) 1BAR 4


GROWTH FACTOR ACIDIC
FIBROBLAST GROWTH
FACTOR (AFGF) MUTANT
WITH CYS 47 1BAR 3
REPLACED BY ALA AND
HIS 93 REPLACED BY GLY
(C47A,H93G) 1BAR 4


GROWTH FACTOR BASIC
FIBROBLAST GROWTH
FACTOR MUTANT WITH
CYS 69 REPLACED 1BFG 3
BY SER AND CYS 87
REPLACED BY SER
(C69S,C87S) 1BFG 4


GROWTH FACTOR BASIC
FIBROBLAST GROWTH
FACTOR MUTANT WITH
CYS 69 REPLACED 1BFG 3
BY SER AND CYS 87
REPLACED BY SER
(C69S,C87S) 1BFG 4


BASIC FIBROBLAST


SEQFOLD 
score 






55.74 




56.94 




67.74




score 




© 




so 
d 




VD 

d 




Verify 
score 




00 

co 
d 




vo 
*0 
o 




VO 

d 




Psi 
Blast 




CO 

I 

00 

vd 


CO 

*? 


CO 

■s 


3.4e-36 


VO 
CO 

* 

CO 


On 

•2 






CO 


ft 


CO 


VO 


p 


»— < 


START 
AA 








cs 

CO 


OO 
CO 








a 




< 




n 












03 


S3 

X> 


•o 


x> 

•—I 


00 


5 


SEQ ID 
NO: 




CO 


3 


5 


5 


CO 


5 
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PDB annotation 


GROWTH FACTOR 


GROWTH FACTOR FGF-2; 
GROWTH FACTOR 


HORMONE/GROWTH 
FACTOR BETA-TREFOIL 


HORMONE/GROWTH
FACTOR BETA-TREFOIL


HORMONE/GROWTH
FACTOR BETA-TREFOIL,
HORMONE/GROWTH
FACTOR


GROWTH FACTOR AFG;
2AFG 6


Compound 


GROWTH FACTOR; 
CHAIN: NULL; 


BASIC FIBROBLAST 
GROWTH FACTOR; 
CHAIN: NULL; 


FIBROBLAST GROWTH
FACTOR 7; CHAIN: A, B;


FIBROBLAST GROWTH
FACTOR 7; CHAIN: A, B;


FIBROBLAST GROWTH
FACTOR 7/1 CHIMERA;
CHAIN: A;


ACIDIC FIBROBLAST
GROWTH FACTOR; 2AFG 4
CHAIN: A, B, C, D; 2AFG 5


SEQFOLD 
score 














PMF 
score 




0.03 


0.76 


53 
© 


0.75 


0.66 


Verify 
score 




0.11 


0.40 


0.29 


0.62 


0.49 


Psi 
Blast 




OA 

co • 
• 


I 
co 


m 
d) 
»n 


r- 
m 
i 

o 
«r> 


CO 
CO 
1 

o 

CN 


§ < 

a < 




*— « 


<N 




cn 
r- 

«— 4 


CO 


START 
AA ' 




SO 


OS 
CO 


CO 


On 
CO 


m 
*<* 


CHAIN 
ID 






< 




< 


< 


u 




1bla


cr 
cr 


cr 
cr 


1qql


2afg


SEQ ID
NO:




CO 




5 


CO 
Tf 


CO 



/ 
Table 6



SEQ ID NO:   Position of Signal   Maximum score   Mean score
             Peptide
1            24                   0.978           0.760
2            32                   0.995           0.681
3            37                   0.979           0.718
4            18                   0.925           0.822
5            28                   0.939           0.749
6            41                   0.989           0.690
7            26                   0.960           0.674
8            16                   0.973           0.925
9            24                   0.978           0.760
10           18                   0.887           0.579
11           42                   0.977           0.587
12           21                   0.966           0.848
13           25                   0.993           0.954
14           28                   0.909           0.664
16           23                   0.913           0.597
17           42                   0.978           0.689
18           21                   0.930           0.662
19           45                   0.985           0.714
20           37                   0.992           0.855
21           31                   0.947           0.775
22           20                   0.979           0.911
24           30                   0.924           0.720
25           26                   0.974           0.824
26           28                   0.982           0.649
28           16                   0.912           0.705
29           27                   0.957           0.652
30           22                   0.968           0.844
31           23                   0.952           0.812
32           18                   0.932           0.884
33           29                   0.991           0.729
34           26                   0.939           0.709
35           29                   0.961           0.842
36           16                   0.951           0.777
37           27                   0.983           0.898
38           17                   0.991           0.955
39           33                   0.977           0.822
40           17                   0.989           0.969
41           30                   0.936           0.679
42           24                   0.993           0.810
44           22                   0.990           0.921
54           18                   0.925           0.822
56           18                   0.981           0.951
60           28                   0.939           0.749
62           33                   0.979           0.757
70           41                   0.989           0.690
79           26                   0.960           0.674
83           18                   0.979           0.963
84           22                   0.967           0.792
87           25                   0.980           0.867
97           16                   0.973           0.925
98           24                   0.978           0.760
99           17                   0.978           0.925
113          18                   0.887           0.579
115          18                   0.952           0.670
120          42                   0.977           0.587
137          21                   0.966           0.848
140          25                   0.993           0.954
153          28                   0.909           0.664
156          18                   0.954           0.747
174          23                   0.913           0.597
175          20                   0.986           0.936
178          42                   0.978           0.689
180          32                   0.929           0.583
184          21                   0.979           0.941
192          21                   0.930           0.662
200          45                   0.985           0.714
212          37                   0.992           0.855
225          24                   0.971           0.882
228          20                   0.979           0.911
237          17                   0.982           0.964
251          13                   0.918           0.692
252          13                   0.918           0.692
256          20                   0.912           0.693
257          20                   0.912           0.693
260          26                   0.974           0.824
262          18                   0.965           0.833
267          25                   0.956           0.765
288          16                   0.912           0.705
289          18                   0.896           0.634
290          19                   0.966           0.897
294          18                   0.991           0.973
295          20                   0.906           0.580
299          27                   0.957           0.652
307          19                   0.983           0.871
310          22                   0.968           0.844
320          23                   0.952           0.812
324          27                   0.982           0.911
327          18                   0.983           0.941
328          18                   0.932           0.884
332          27                   0.990           0.923
335          45                   0.983           0.793
[illegible]  [illegible]          [illegible]     [illegible]
346          29                   0.991           0.729
354          22                   0.978           0.877
363          26                   0.939           0.709
364          22                   0.966           0.843
375          29                   0.961           0.842
379          16                   0.951           0.777
401          44                   0.975           0.876
407          33                   0.977           0.822
417          17                   0.989           0.969
418          23                   0.974           0.799
422          18                   0.981           0.952
426          21                   0.982           0.912
428          30                   0.936           0.679
429          43                   0.978           0.712
433          28                   0.993           0.948
434          43                   0.930           0.624
437          24                   0.993           0.810
438          16                   0.978           0.939
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Table 7 



SEQ ID NO:   Chromosomal location
3            2q11.2
4            20pter-p12.3
5            5q31
6            19p12
7            19p12
8            5
11           12p13-p12
12           p11.2-12.3
13           19p
14           6p12.1-21.1
15           19p13.1
17           16q12-q13
19           15
20           15
22           Xq13.1
23           12
25           11p15.5
26           20
27           22
28           12q23-24.1
29           20
30           13
31           12
33           15
36           4q28
37           14q24.3
38           10
39           20
41           17q12-q21
42           14
44           1q24.1-25.2
45           2
47           3q21-q25
48           9
49           14
50           6q14.1-15
51           19
52           11
53           20
54           16
55           14
56           3
57           19
58           7p15.1-p13
59           19
61           2
62           19
63           16
66           15
70           1p31.1-33
71           9
72           16
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Table 7 



SEQ ID NO:   Chromosomal location
74           5q31-q33
75           3p21.1-q13.13
76           2
77           2
78           21q22.1
79           Xp11.22-p11.21
80           2
81           19
82           20
83           19p13.3
84           19
85           3
86           8
87           1p13
88           16
89           18q21.1-q22
90           11q13.1-q13.3
91           18p11.23-p11.21
92           17
93           10
94           3
95           X
96           6q14.2-16.1
97           1q21.2-22
98           1q21.2-22
99           6
102          8q22-q23
103          10p11.2
104          17
105          17
106          2
107          1
108          16
109          17q21.3-q22
110          11q
111          3p21.1-q13.13
112          16
113          5
114          9
115          3p13-q26.1
116          5
117          7q31
118          14
119          14
120          19
121          19
122          6q27
123          14
124          1q21-q22
125          6
126          17q25
127          15






Table 7 



ill INUI 


l^Hr OXUMJlllnJ lUCUUUU 




IHljj 1 


1 if\ 
130 




131 


I f 
1 1 


1 n 
132 


on 
zu 


133 


orv»i 1 nil 01 


134 


ipjz 


135 


Z<pl 


136 


v* 
A 


138 


lzpli 


139 


ft 

y 


140 


p34. 1-34.3 


141 


19qlz 


142 


lDqzo 


143 


22qll.2l 


144 


17ql2 


145 


4pl6.3 


[ 146 


22 


147 


16pll.2 


148 


18ql2 


150 


4 


151 


7pl2-<jll.zl 


152 


1 A 

14 


153 


1 Anil 11 


155 


lp34 


156 


i/:-Ti 1 
lopl J.J 


157 


lzplj.j 


158 


c 

J 


159 


Q 
O 


160 


IV 


161 


A 

4 


162 


1 
1 


163 


1 lCjZ^ 


164 


1 
D 


165 


lzqzz 


168 


10 


170 


1 


171 


loqlz 1 




7 


174 


13 


1 /3 


zpzj.3~<jj^» J 


176 


16 


| 178 


10 


179 


Iq21-q25 


180 


19pl3.3 


181 


1 


184 


lp35.l-36.23 


185 


1 


186 




187 


3pl3-q26.1 


188 


f 3 


189 


17 


190 


6 
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Table 7 



SEQ ID NO:   Chromosomal location
193          11p15.[illegible]
194          14q32
195          12
196          10q24
198          1p36.1
199          5q22
200          11
201          2q31
202          17
206          Xp11.23
207          9q34
208          19
209          20
210          11q23
211          16p12
212          19q13.1
213          7p15
214          15
215          1p36.21-36.33
216          11
217          22q11.2
218          15
219          19q13.4
222          19
223          1q25.2
226          1
227          1p36.11-36.23
228          1p36.3-p36.13
230          17
231          7q33-q34
232          3
233          9
234          10
235          17
236          4
237          19q13.4
238          4q25
239          2
240          7
241          12
243          6p21.3
244          3p13-q26.1
245          17
246          1p34.1
247          3q23
248          3p21.3
249          20
250          20
251          18q12-q21
252          18q12-q21
253          14
254          1p35.3-p35.1
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Table 7 



SEQ ID NO:   Chromosomal location
256          6q25-q26
257          6q25-q26
258          1q21-q23
259          16p13.2-16p13.11
260          14q21.1-q24.1
261          2p23.3-q32.3
262          12
263          19
264          4q28
265          2
266          2
267          1q21-q23
268          20p12.3-p13
269          4
270          6
271          2p23.3-q14.3
272          18q21
273          18q21
274          14q22
275          6p21.3
276          5
280          8
281          4q22-q24
282          2
283          7q22-q31.1
284          11
285          11q12.3
286          10
287          19
290          17
291          4q22
292          1p36.11-36.23
293          19
294          22
296          3
297          4p16
298          6
299          8q13
300          20
301          15
302          22q11.2-q22
303          15
304          6
306          6
307          9p24.2
308          2p23.3-q24.3
309          14
310          6
311          2
312          4
313          19pter-19p13.3
314          3
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Table 7 



atAJ ID iS%Jt 


V^UTUiUMJ UUU iutauuu 


Il£ 

31D 


11nl2-14 2 1 


31 I 


19 


11 Q 
31o 


17 


1 1 O. 


17 ! 


320 


Snld 

JUiH 


323 




324 


■sp 


one 

32S 


£ n 91 i.oi 11 I 


326 


17r»1 1 9 5 


327 


Q 


328 


^n91 


329 


It 


330 


i 

D \ 


331 


1»»0 1 1 OO 1 

IpZl.l-ZZ.I 


332 


a 


333 


/ ! 


334 


liqi3 


337 


14 


338 


7q3j-q3o 


r 339 


13 


340 


oqil.i-z^.jo _j 


341 


Uql2-ql3.i 


343 


in 

1U 


344 


ID ! 


345 


1£ 
10 


346 


1 1 (JZZ i 


347 


10 


O AO 

348 




350 


Yn1 1 01.1 1 99 
Apl 1.41-1 1.44 


i 1 C A 

! 354 


AO 


i ICC 

355 


10 


356 


1 1 1 
1 1 


ICO 

358 


Yr»1 1 9^ 
Apl 1*43 


o CO 

359 


*r } 


360 


Q 

a j 


362 


*r 


363 


1 1 
11 


364 


1 11] JO 


365 




366 


22ql3.31-13.32 


367 


5 ? 


370 


19 


! 371 


7q31.1-7q31.33 


372 


2q37.3 


373 


3 


374 


16 


375 


19ql3.4 


376 


18ql2 


377 


18ql2 


379 


8 


380 


imu 


381 


6 
SEQ ID NO:   Chromosomal location
385          4q28
386          15
387          10
388          17
389          11p15.4
390          6p21.3
391          22q13
392          3
393          19
394          15
395          1
396          6p21.2-p21.3
397          15
399          7q31
400          14
402          Xq28
403          10
404          16
406          16
408          11
412          20q12-13.1
413          15
414          17
415          4
416          12q
419          21q22.1
420          16p11.2
422          6
424
426          14
428          14
429          1q22-q23
430          11q13
431          3
432          2
433          19q13.1
434          20q13.1
435          18q23
436          11q24
437          10
438          4q21-q25
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Table 8 



SEQ ID NO: of Full-length   SEQ ID NO: of Full-length   SEQ ID NO: in Priority Application
Nucleotide Sequence         Nucleotide Sequence         USSN 09/774,528
52                          52                          54
53                          53                          55
54                          54                          56
55                          55                          57
56                          56                          58
57                          57                          59
58                          58                          60
59                          59                          61
60                          60                          62
61                          61                          63
62                          62                          64
63                          63                          65
64                          64                          66
65                          65                          67
66                          66                          68
67                          67                          69
68                          68                          70
69                          69                          71
70                          70                          72
71                          71                          73
72                          72                          74
73                          73                          75
74                          74                          76
75                          75                          77
76                          76                          78
77                          77                          79
78                          78                          80
79                          79                          81
80                          80                          82
81                          81                          83
82                          82                          84
83                          83                          85
84                          84                          86
85                          85                          87
86                          86                          88
87                          87                          89
88                          88                          90
89                          89                          91
90                          90                          92
91                          91                          93
92                          92                          94
93                          93                          95
94                          94                          96
95                          95                          97
96                          96                          98
97                          97                          99
98                          98                          100
99                          99                          101
100                         100                         102
101                         101                         103
102                         102                         104
103                         103                         105
104                          104                          106
105                          105                          107
106                          106                          108
107                          107                          109
108                          108                          110
109                          109                          111
110                          110                          112
111                          111                          113
112                          112                          114
113                          113                          115
114                          114                          116
115                          115                          117
116                          116                          118
117                          117                          119
118                          118                          120
119                          119                          121
120                          120                          122
121                          121                          123
122                          122                          124
123                          123                          125
124                          124                          126
125                          125                          127
126                          126                          128
127                          127                          129
128                          128                          130
129                          129                          131
130                          130                          132
131                          131                          133
132
WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the 
group consisting of SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO: 
1-438, an active domain coding portion of SEQ ID NO: 1-438, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of
claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under
stringent conditions with any one of SEQ ID NO: 1-438.



11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex; and

b) detecting formation of the complex, so that if a complex formation
is detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising, 

a) culturing a host cell comprising a polynucleotide sequence selected 
from SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO: 1-438, an 
active domain coding portion of SEQ ID NO: 1-438, complementary sequences thereof 
and a polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:
1-438, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides encoded by SEQ ID NO: 1-438, the 
mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-438. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 

Group VI, claim(s) 17-18, drawn to a method of identifying a compound that binds to the polypeptide of Group II.
Group VII, claim(s) 27, drawn to a method of treatment using the polypeptide of Group II.
Group VIII, claim(s) 28, drawn to a method of treatment using the antibody of Group III.



The inventions listed as Groups I-VIII do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT
Rule 13.2, they lack the same or corresponding special technical features for the following reasons: 

The DNA of SEQ ID NO: 1-438 are different in structure and encode polypeptides having different structure and different function or 
substrate specificity. Therefore, in addition to electing one Group, applicants must further elect one DNA sequence or one 
polypeptide sequence encoded by SEQ ID NO: 1-438. 

The technical feature linking Groups I-VIII appears to be that they all relate to the DNA of SEQ ID NO: 1-438.
However, Dumas et al. teach a polypeptide encoded by a polynucleotide that is 99% identical to SEQ ID NO: 231.

Therefore, the technical feature linking the inventions of Groups I-X does not constitute a special technical feature as defined by PCT 
Rule 13.2, as it does not define a contribution over the prior art. 

Groups I-III do not share a technical feature because a DNA, a protein, and an antibody are different compounds, each with its own
chemical structure and function, and they have different utilities. The DNA molecule of Group I is not limited in use to the
production of the polypeptide of Group II and can be used as a hybridization probe, and the protein of Group II can be obtained by a
materially different method such as biochemical purification. The structure of an antibody of Group III is not predictable from the
structure of the protein of Group II, and an antibody can cross-react with various proteins.

The special technical feature of Group I is a DNA of SEQ ID NO: 1-438, vector comprising said DNA, host cell comprising said 
DNA and a method of producing polypeptides. 

The special technical feature of Group II is a polypeptide encoded by the DNA of Group I.
The special technical feature of Group III is an antibody against the protein of Group II.
The special technical feature of Group IV is a method of detecting the DNA of Group I.
The special technical feature of Group V is a method of detecting the polypeptide of Group II.

The special technical feature of Group VI is a method of identifying a compound that binds to the polypeptide of Group II.

The special technical feature of Group VII is a method of treatment using the polypeptide of Group II.

The special technical feature of Group VIII is a method of treatment using the antibody of Group III.

Accordingly, Groups I-X are not so linked by the same or a corresponding special technical feature as to form a single general 
inventive concept. 